[ https://issues.apache.org/jira/browse/HIVE-26633?focusedWorklogId=817810&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-817810 ]
ASF GitHub Bot logged work on HIVE-26633:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 17/Oct/22 20:35
Start Date: 17/Oct/22 20:35
Worklog Time Spent: 10m
Work Description: amansinha100 commented on code in PR #3674:
URL: https://github.com/apache/hive/pull/3674#discussion_r997487069
##########
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##########
@@ -2913,7 +2913,10 @@ public static enum ConfVars {
HIVE_STATS_MAX_NUM_STATS("hive.stats.max.num.stats", (long) 10000,
"When the number of stats to be updated is huge, this value is used to control the number of \n" +
" stats to be sent to HMS for update."),
-
+ HIVE_THRIFT_MAX_MESSAGE_SIZE("hive.thrift.max.message.size", "1gb",
Review Comment:
I see. I agree that hive.server2.thrift.max.message.size is already used
for a different purpose (the stringLengthLimit and containerLengthLimit),
and it would not be appropriate to overload it for our use (I didn't realize
earlier that this setting has been around for 8 years or so).
Regarding the name of the new setting, I don't have a better suggestion at
the moment. I can foresee a fair amount of confusion between these two config
options. Let me give it some more thought.
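For context on the "1gb" default in the hunk above: a human-readable size has to be turned into a byte count before it can be used as a message-size limit. The sketch below assumes the limit is ultimately held as a Java int (so anything above Integer.MAX_VALUE must be capped). MessageSizeParser and toIntBytes are hypothetical names used only for illustration; this is not Hive's own size handling and not the code in the PR.

```java
import java.util.Locale;

/** Hypothetical helper (not part of Hive): turns "1gb"-style strings into an int byte limit. */
final class MessageSizeParser {
    private MessageSizeParser() {}

    /** Parses values such as "512", "512b", "64kb", "100mb" or "1gb"; caps the result at Integer.MAX_VALUE. */
    static int toIntBytes(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        long multiplier = 1L;
        String digits = v;
        // Check two-letter suffixes first, because "kb", "mb" and "gb" also end in "b".
        if (v.endsWith("kb")) {
            multiplier = 1L << 10;
            digits = v.substring(0, v.length() - 2);
        } else if (v.endsWith("mb")) {
            multiplier = 1L << 20;
            digits = v.substring(0, v.length() - 2);
        } else if (v.endsWith("gb")) {
            multiplier = 1L << 30;
            digits = v.substring(0, v.length() - 2);
        } else if (v.endsWith("b")) {
            digits = v.substring(0, v.length() - 1);
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing a long.
        long bytes = Math.multiplyExact(Long.parseLong(digits.trim()), multiplier);
        if (bytes <= 0) {
            throw new IllegalArgumentException("size must be positive: " + value);
        }
        // Assumption: the consumer stores the limit as an int, so cap rather than wrap around.
        return (int) Math.min(bytes, (long) Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(toIntBytes("1gb"));   // 1073741824
        System.out.println(toIntBytes("100mb")); // 104857600
        System.out.println(toIntBytes("4gb"));   // 2147483647 (capped)
    }
}
```

Under this assumption a "1gb" default still fits in an int, while larger values would be silently capped, which is worth keeping in mind when documenting the new option.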
Issue Time Tracking
-------------------
Worklog Id: (was: 817810)
Time Spent: 1h 10m (was: 1h)
> Make thrift max message size configurable
> -----------------------------------------
>
> Key: HIVE-26633
> URL: https://issues.apache.org/jira/browse/HIVE-26633
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 4.0.0-alpha-2
> Reporter: John Sherman
> Assignee: John Sherman
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Since version 0.14, Thrift enforces a maximum message size through a
> TConfiguration object, as described here:
> [https://github.com/apache/thrift/blob/master/doc/specs/thrift-tconfiguration.md]
> By default, MaxMessageSize is set to 100MB.
> As a result, HMS clients may be unable to retrieve metadata for tables with a
> large number of partitions or other metadata.
> For example, on a cluster configured with Kerberos between HS2 and HMS,
> querying a large table (10k partitions, 200 columns with names of 200
> characters) results in this backtrace:
> {code:java}
> org.apache.thrift.transport.TTransportException: MaxMessageSize reached
> at org.apache.thrift.transport.TEndpointTransport.countConsumedMessageBytes(TEndpointTransport.java:96)
> at org.apache.thrift.transport.TMemoryInputTransport.read(TMemoryInputTransport.java:97)
> at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:390)
> at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:39)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
> at org.apache.hadoop.hive.metastore.security.TFilterTransport.readAll(TFilterTransport.java:63)
> at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:464)
> at org.apache.thrift.protocol.TBinaryProtocol.readByte(TBinaryProtocol.java:329)
> at org.apache.thrift.protocol.TBinaryProtocol.readFieldBegin(TBinaryProtocol.java:273)
> at org.apache.hadoop.hive.metastore.api.FieldSchema$FieldSchemaStandardScheme.read(FieldSchema.java:461)
> at org.apache.hadoop.hive.metastore.api.FieldSchema$FieldSchemaStandardScheme.read(FieldSchema.java:454)
> at org.apache.hadoop.hive.metastore.api.FieldSchema.read(FieldSchema.java:388)
> at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.read(StorageDescriptor.java:1269)
> at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.read(StorageDescriptor.java:1248)
> at org.apache.hadoop.hive.metastore.api.StorageDescriptor.read(StorageDescriptor.java:1110)
> at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.read(Partition.java:1270)
> at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.read(Partition.java:1205)
> at org.apache.hadoop.hive.metastore.api.Partition.read(Partition.java:1062)
> at org.apache.hadoop.hive.metastore.api.PartitionsByExprResult$PartitionsByExprResultStandardScheme.read(PartitionsByExprResult.java:420)
> at org.apache.hadoop.hive.metastore.api.PartitionsByExprResult$PartitionsByExprResultStandardScheme.read(PartitionsByExprResult.java:399)
> at org.apache.hadoop.hive.metastore.api.PartitionsByExprResult.read(PartitionsByExprResult.java:335)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_by_expr_result$get_partitions_by_expr_resultStandardScheme.read(ThriftHiveMetastore.java)
> {code}
> Making this configurable (and defaulting to a higher value) would allow these
> tables to still be accessible.
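A rough back-of-the-envelope check of the example in the description (10k partitions, 200 columns, 200-character column names) shows why the 100MB default is exceeded. The sketch assumes each partition in the response carries its own copy of the column list (consistent with the Partition -> StorageDescriptor -> FieldSchema frames in the backtrace), that each character takes one byte, and it ignores column types, comments and Thrift framing overhead, so the result is only a lower bound. PartitionPayloadEstimate and its members are hypothetical names, not Hive or Thrift APIs.

```java
/** Hypothetical lower-bound estimate; not part of Hive or Thrift. */
final class PartitionPayloadEstimate {
    /** The 100MB default mentioned in the description, expressed in bytes. */
    static final long DEFAULT_MAX_MESSAGE_SIZE = 100L * 1024 * 1024;

    private PartitionPayloadEstimate() {}

    /** Bytes needed for the column names alone, assuming one byte per character. */
    static long columnNameBytes(long partitions, long columnsPerPartition, long nameLength) {
        // multiplyExact guards against silent long overflow for extreme inputs.
        return Math.multiplyExact(Math.multiplyExact(partitions, columnsPerPartition), nameLength);
    }

    /** True when the estimated payload is larger than the default limit. */
    static boolean exceedsDefaultLimit(long payloadBytes) {
        return payloadBytes > DEFAULT_MAX_MESSAGE_SIZE;
    }

    public static void main(String[] args) {
        long bytes = columnNameBytes(10_000, 200, 200);
        System.out.println(bytes);                      // 400000000
        System.out.println(DEFAULT_MAX_MESSAGE_SIZE);   // 104857600
        System.out.println(exceedsDefaultLimit(bytes)); // true
    }
}
```

Even this lower bound (about 400 million bytes) is several times the 100MB default, so a single get_partitions_by_expr response for such a table cannot fit without raising the limit.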