[ 
https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang resolved HIVE-10410.
--------------------------------
    Resolution: Fixed

[~rich williams], thanks a lot for reporting the issue and for the
verification. I have marked the issue as fixed for now.

> Apparent race condition in HiveServer2 causing intermittent query failures
> --------------------------------------------------------------------------
>
>                 Key: HIVE-10410
>                 URL: https://issues.apache.org/jira/browse/HIVE-10410
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 0.13.1
>         Environment: CDH 5.3.3
> CentOS 6.4
>            Reporter: Richard Williams
>         Attachments: HIVE-10410.1.patch
>
>
> On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC 
> occasionally trigger odd Thrift exceptions with messages such as "Read a 
> negative frame size (-2147418110)!" or "out of sequence response" in 
> HiveServer2's connections to the metastore. For certain metastore calls (for 
> example, showDatabases), these Thrift exceptions are converted to 
> MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient 
> from retrying these calls and thus causes the failure to bubble out to the 
> JDBC client.
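The interaction described above (a retry layer that only retries transport-level failures, sitting on top of a client that converts those failures into another exception type first) can be illustrated with a small model. This is only a sketch under assumptions: `RetrySketch`, `TransportError` and `ConvertedError` are hypothetical stand-ins, not the Hive or Thrift classes, and the real RetryingMetaStoreClient logic may differ in detail.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

final class RetrySketch {
    /** Stand-in for a Thrift transport-level failure (hypothetical, not the Thrift class). */
    static class TransportError extends Exception {
        TransportError(String msg) { super(msg); }
    }

    /** Stand-in for the converted exception (hypothetical, not Hive's MetaException). */
    static class ConvertedError extends Exception {
        ConvertedError(String msg, Throwable cause) { super(msg, cause); }
    }

    /** Retries only when the call fails with TransportError; anything else propagates at once. */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransportError last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (TransportError e) {
                last = e; // retryable: go around the loop again
            }
        }
        throw last;
    }

    /** Counts how often a call that fails exactly once is invoked, with or without conversion. */
    static int attemptsMade(boolean convertBeforeRetryLayer) {
        AtomicInteger calls = new AtomicInteger();
        Callable<String> flaky = () -> {
            if (calls.incrementAndGet() == 1) {
                throw new TransportError("Read a negative frame size");
            }
            return "ok";
        };
        // Models a client method that catches the transport failure and rethrows another type.
        Callable<String> converting = () -> {
            try {
                return flaky.call();
            } catch (TransportError e) {
                throw new ConvertedError("Got exception: " + e.getMessage(), e);
            }
        };
        try {
            callWithRetry(convertBeforeRetryLayer ? converting : flaky, 3);
        } catch (Exception e) {
            // Reached in the converting case: the retry layer never sees a TransportError.
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("attempts without conversion: " + attemptsMade(false));
        System.out.println("attempts with conversion:    " + attemptsMade(true));
    }
}
```

In this model the unconverted failure is retried and the second attempt succeeds, while the converted failure ends the call after a single attempt, which matches the behaviour the report describes for showDatabases.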
> Note that, as far as we can tell, this issue only affects queries that are 
> submitted with the runAsync flag on TExecuteStatementReq set to true (which, 
> in practice, seems to mean all JDBC queries), and it only manifests when 
> HiveServer2 is using the new HTTP transport mechanism. When both of these 
> conditions hold, we are able to reproduce the issue fairly reliably by 
> spawning about 100 simple, concurrent Hive queries (we have been using 
> "show databases"), two or three of which typically fail. However, when 
> either of these conditions does not hold, we are no longer able to reproduce 
> the issue.
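For anyone trying to repeat the reproduction steps above, a harness along these lines may be a starting point. It is a sketch, not the reporter's actual test: `ConcurrentQueryRepro` and its methods are hypothetical names, the JDBC URL is left as a command-line argument because it depends on the cluster's HTTP transport and Kerberos settings, and the Hive JDBC driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ConcurrentQueryRepro {
    /** One "show databases" round trip over its own JDBC connection. */
    static Callable<Void> showDatabases(String jdbcUrl) {
        return () -> {
            try (Connection conn = DriverManager.getConnection(jdbcUrl);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("show databases")) {
                while (rs.next()) {
                    // Drain the result set; only success or failure matters here.
                }
            }
            return null;
        };
    }

    /** Runs the same query on `queries` threads at once and counts how many runs throw. */
    static int countFailures(int queries, Callable<Void> query) {
        ExecutorService pool = Executors.newFixedThreadPool(queries);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int i = 0; i < queries; i++) {
                tasks.add(query);
            }
            int failures = 0;
            for (Future<Void> result : pool.invokeAll(tasks)) {
                try {
                    result.get();
                } catch (ExecutionException e) {
                    failures++;
                }
            }
            return failures;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for queries", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: ConcurrentQueryRepro <jdbc-url>");
            return;
        }
        // 100 matches the "about 100" concurrent queries mentioned in the report.
        int failures = countFailures(100, showDatabases(args[0]));
        System.out.println(failures + " of 100 queries failed");
    }
}
```

Each task opens its own connection so that any shared state is on the HiveServer2 side rather than in the test client.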
> Some example stack traces from the HiveServer2 logs:
> {noformat}
> 2015-04-16 13:54:55,486 ERROR hive.log: Got exception: 
> org.apache.thrift.transport.TTransportException Read a negative frame size 
> (-2147418110)!
> org.apache.thrift.transport.TTransportException: Read a negative frame size 
> (-2147418110)!
>         at 
> org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435)
>         at 
> org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414)
>         at 
> org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>         at 
> org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
>         at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>         at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>         at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>         at 
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
>         at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837)
>         at 
> org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60)
>         at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
>         at com.sun.proxy.$Proxy6.getDatabases(Unknown Source)
>         at 
> org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139)
>         at 
> org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:364)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>         at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:957)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>         at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
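A note on the number in this trace: read as an unsigned 32-bit value, -2147418110 is 0x80010002. In Thrift's strict binary protocol a message starts with the version constant 0x80010000 OR'ed with the message type, and type 2 is REPLY. So the "frame size" that TSaslTransport read looks like the first four bytes of an ordinary reply message, which would be consistent with (though it does not prove) two threads reading from the same metastore connection. The sketch below just checks the arithmetic; `FrameSizeDecoder` is a hypothetical name, and the constants are copied from Thrift's TBinaryProtocol and TMessageType on the assumption that they match the Thrift version in use.

```java
final class FrameSizeDecoder {
    // Values as defined in Thrift's TBinaryProtocol and TMessageType
    // (assumed to be the same in the Thrift version in use here).
    static final int VERSION_MASK = 0xffff0000;
    static final int VERSION_1 = 0x80010000;
    static final int REPLY = 2;

    /** Hex form of the 32-bit word, without sign. */
    static String toHex(int word) {
        return Integer.toHexString(word);
    }

    /** True if the word has the shape of a strict binary-protocol REPLY header. */
    static boolean looksLikeStrictReplyHeader(int word) {
        return (word & VERSION_MASK) == VERSION_1 && (word & 0xff) == REPLY;
    }

    public static void main(String[] args) {
        int reported = -2147418110; // value from the log message
        System.out.println("0x" + toHex(reported));
        System.out.println("reply header? " + looksLikeStrictReplyHeader(reported));
    }
}
```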
> The above exception being converted into a MetaException and re-thrown:
> {noformat}
> 2015-04-16 13:54:55,486 ERROR hive.ql.exec.DDLTask: 
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got 
> exception: org.apache.thrift.transport.TTransportException Read a negative 
> frame size (-2147418110)!)
>         at 
> org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1141)
>         at 
> org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:364)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>         at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:957)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>         at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: MetaException(message:Got exception: 
> org.apache.thrift.transport.TTransportException Read a negative frame size 
> (-2147418110)!)
>         at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1116)
>         at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:839)
>         at 
> org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60)
>         at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
>         at com.sun.proxy.$Proxy6.getDatabases(Unknown Source)
>         at 
> org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139)
>         ... 22 more
> {noformat}
> The above MetaException causing the query as a whole to fail:
> {noformat}
> 2015-04-16 13:54:55,486 ERROR 
> org.apache.hive.service.cli.operation.Operation: Error running hive query:
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: 
> org.apache.thrift.transport.TTransportException Read a negative frame size 
> (-2147418110)!)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:148)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>         at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> An "out of sequence response" that occurred shortly after the above exception 
> and may have been triggered by it:
> {noformat}
> 2015-04-16 13:54:55,498 ERROR hive.log: Got exception: 
> org.apache.thrift.TApplicationException get_databases failed: out of sequence 
> response
> org.apache.thrift.TApplicationException: get_databases failed: out of 
> sequence response
>         at 
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
>         at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837)
>         at 
> org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60)
>         at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
>         at com.sun.proxy.$Proxy6.getDatabases(Unknown Source)
>         at 
> org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139)
>         at 
> org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:364)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>         at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:957)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>         at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
>         at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
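The "out of sequence response" message comes from the Thrift client comparing the sequence id in a reply with the id of the request it last sent. If two threads share one client, that check can fail even though the server answered both requests correctly. The following is a deterministic toy model of such an interleaving, not Thrift code: `SharedClientSketch` is a hypothetical class, and it assumes a server that echoes each request's sequence id back in order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SharedClientSketch {
    private int seqid;                                       // one counter per client
    private final Deque<Integer> wire = new ArrayDeque<>();  // replies, in request order

    /** Sends a request; the modelled server echoes its sequence id back. */
    void send() {
        wire.addLast(++seqid);
    }

    /** Reads the next reply and reports whether its id matches the client's current id. */
    boolean receive() {
        return wire.removeFirst() == seqid;
    }

    /** One caller at a time: every reply matches. */
    static boolean[] sequential() {
        SharedClientSketch client = new SharedClientSketch();
        client.send();
        boolean first = client.receive();
        client.send();
        boolean second = client.receive();
        return new boolean[] {first, second};
    }

    /** Two callers sharing the client: both send before either receives. */
    static boolean[] interleaved() {
        SharedClientSketch client = new SharedClientSketch();
        client.send();                      // caller A
        client.send();                      // caller B
        boolean a = client.receive();       // A reads reply 1 while the counter is already 2
        boolean b = client.receive();       // B reads reply 2
        return new boolean[] {a, b};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(sequential()));
        System.out.println(java.util.Arrays.toString(interleaved()));
    }
}
```

In the interleaved case the first receive fails the check, which is the condition that would surface as an out-of-sequence error in a real client.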



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
