[ https://issues.apache.org/jira/browse/THRIFT-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17897197#comment-17897197 ]
Sercan Tekin commented on THRIFT-3914:
--------------------------------------
I hit the same issue with Hive when authentication is enabled.
{code:java}
Exception in thread "pool-7-thread-7" java.lang.OutOfMemoryError
    at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
    at org.apache.thrift.transport.TSaslTransport.write(TSaslTransport.java:473)
    at org.apache.thrift.transport.TSaslServerTransport.write(TSaslServerTransport.java:42)
    at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:227)
    at org.apache.hadoop.hive.metastore.api.FieldSchema$FieldSchemaStandardScheme.write(FieldSchema.java:517)
    at org.apache.hadoop.hive.metastore.api.FieldSchema$FieldSchemaStandardScheme.write(FieldSchema.java:456)
    at org.apache.hadoop.hive.metastore.api.FieldSchema.write(FieldSchema.java:394)
    at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1423)
    at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1250)
    at org.apache.hadoop.hive.metastore.api.StorageDescriptor.write(StorageDescriptor.java:1116)
    at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:1033)
    at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:890)
    at org.apache.hadoop.hive.metastore.api.Partition.write(Partition.java:786)
{code}
I believe the cause is Java's conservative limit on the maximum array size; see [this JDK comment|https://github.com/openjdk/jdk/blob/0e0dfca21f64ecfcb3e5ed7cdc2a173834faa509/src/java.base/share/classes/java/io/InputStream.java#L307-L313].
Spark followed the same approach to fix this issue:
https://github.com/apache/spark/commit/e5a5921968c84601ce005a7785bdd08c41a2d862#diff-607488c104788f0156de87abab394cf33aa76148b1e3d122d328e165a25c1838R22
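The conservative growth strategy the JDK (and Spark) use can be sketched roughly as follows. This is a simplified reconstruction, not the actual JDK source; the class and method names are hypothetical, but the soft cap of {{Integer.MAX_VALUE - 8}} matches the constant the JDK uses internally:
{code:java}
// Sketch of the JDK-style conservative array-growth strategy: cap sizes
// slightly below Integer.MAX_VALUE, because some VMs reserve header words
// in arrays and throw OutOfMemoryError for larger requests.
public class SoftMaxArraySketch {
    // Integer.MAX_VALUE - 8, the soft cap used inside the JDK.
    static final int SOFT_MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    // Hypothetical helper: compute a grown capacity without exceeding the cap.
    static int newLength(int oldLength, int minGrowth, int prefGrowth) {
        int prefLength = oldLength + Math.max(minGrowth, prefGrowth); // may overflow
        if (0 < prefLength && prefLength <= SOFT_MAX_ARRAY_LENGTH) {
            return prefLength;
        }
        // Preferred length exceeded the cap (or overflowed): fall back to the
        // minimum required length, clamped to the soft cap when possible.
        int minLength = oldLength + minGrowth;
        if (minLength < 0) { // overflow: even the minimum does not fit in an int
            throw new OutOfMemoryError("Required array length too large");
        }
        return Math.max(minLength, SOFT_MAX_ARRAY_LENGTH);
    }

    public static void main(String[] args) {
        System.out.println(newLength(100, 1, 100));                        // normal growth: 200
        System.out.println(newLength(Integer.MAX_VALUE - 100, 50, 1 << 30)); // clamped to the cap
    }
}
{code}
Growing toward the soft cap instead of {{Integer.MAX_VALUE}} avoids the "Requested array size exceeds VM limit" error seen in the traces above.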
> TSaslServerTransport throws OOM due to ByteArrayOutputStream limitation
> -----------------------------------------------------------------------
>
> Key: THRIFT-3914
> URL: https://issues.apache.org/jira/browse/THRIFT-3914
> Project: Thrift
> Issue Type: Bug
> Components: Java - Library
> Affects Versions: 0.9.3
> Reporter: Chaoyu Tang
> Priority: Major
>
> TSaslServerTransport uses a ByteArrayOutputStream as its write buffer, but
> ByteArrayOutputStream is limited to a maximum buffer size of Integer
> MAX_VALUE (2,147,483,647) bytes. If it needs to write a result exceeding
> this limit, it throws an OutOfMemoryError with the message "Requested array
> size exceeds VM limit". Following is the stack trace from a Hive use case:
> {code}
> Exception in thread "pool-6-thread-9" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
>     at java.util.Arrays.copyOf(Arrays.java:2271)
>     at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>     at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>     at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>     at org.apache.thrift.transport.TSaslTransport.write(TSaslTransport.java:476)
>     at org.apache.thrift.transport.TSaslServerTransport.write(TSaslServerTransport.java:41)
>     at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:202)
>     at org.apache.hadoop.hive.metastore.api.SerDeInfo$SerDeInfoStandardScheme.write(SerDeInfo.java:579)
>     at org.apache.hadoop.hive.metastore.api.SerDeInfo$SerDeInfoStandardScheme.write(SerDeInfo.java:501)
>     at org.apache.hadoop.hive.metastore.api.SerDeInfo.write(SerDeInfo.java:439)
>     at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1490)
>     at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1288)
>     at org.apache.hadoop.hive.metastore.api.StorageDescriptor.write(StorageDescriptor.java:1154)
>     at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:1072)
>     at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:929)
>     at org.apache.hadoop.hive.metastore.api.Partition.write(Partition.java:825)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.write(ThriftHiveMetastore.java)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.write(ThriftHiveMetastore.java)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.write(ThriftHiveMetastore.java:65485)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:707)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:702)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:702)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
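The failing code path in the trace above can be sketched roughly as follows. This is a simplified reconstruction from memory of the JDK 8 source, not verbatim; it mirrors the logic of {{java.io.ByteArrayOutputStream.hugeCapacity}} to show why the error surfaces near the 2 GB mark:
{code:java}
// Sketch of why ByteArrayOutputStream reports OutOfMemoryError near 2 GB:
// grow() computes a new capacity, and the hugeCapacity path lets it reach
// Integer.MAX_VALUE, a size most VMs refuse with
// "Requested array size exceeds VM limit".
public class HugeCapacitySketch {
    // Some VMs reserve header words in arrays; allocations above this may fail.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Simplified mirror of java.io.ByteArrayOutputStream.hugeCapacity (JDK 8).
    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) { // int overflow while growing the buffer
            throw new OutOfMemoryError();
        }
        // Requests above MAX_ARRAY_SIZE are rounded up to Integer.MAX_VALUE;
        // the subsequent Arrays.copyOf is what actually fails on most VMs.
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(hugeCapacity(Integer.MAX_VALUE - 4)); // rounded up to 2147483647
        System.out.println(hugeCapacity(Integer.MAX_VALUE - 8)); // stays at 2147483639
    }
}
{code}
So the transport does not merely hit a hard 2 GB wall: once the buffer's requested capacity crosses {{MAX_ARRAY_SIZE}}, the allocation itself becomes one most VMs reject.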
--
This message was sent by Atlassian Jira
(v8.20.10#820010)