Richard Williams created HIVE-10410:
---------------------------------------
Summary: Apparent race condition in HiveServer2 causing intermittent query failures
Key: HIVE-10410
URL: https://issues.apache.org/jira/browse/HIVE-10410
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.13.1
Environment: CDH 5.3.3, CentOS 6.5
Reporter: Richard Williams
On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC
occasionally trigger odd Thrift exceptions with messages such as "Read a
negative frame size (-2147418110)!" or "out of sequence response" in
HiveServer2's connections to the metastore. For certain metastore calls (for
example, showDatabases), these Thrift exceptions are converted to
MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient
from retrying these calls and thus causes the failure to bubble out to the JDBC
client.
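To illustrate why the exception conversion described above matters, here is a minimal sketch of a retry wrapper that retries only on transport-level errors. It is a simplified model, not Hive's actual code: TransportException, MetaException, Call, withRetries and wrapAsMetaException are all hypothetical stand-ins defined in the block itself, and the real retry conditions in RetryingMetaStoreClient may differ.

```java
public class RetryWrappingSketch {
    /** Stand-in for a Thrift transport/protocol error (hypothetical, not the real class). */
    public static class TransportException extends Exception {
        public TransportException(String message) { super(message); }
    }

    /** Stand-in for Hive's MetaException (hypothetical, not the real class). */
    public static class MetaException extends Exception {
        public MetaException(String message) { super(message); }
    }

    /** A metastore call that may throw. */
    public interface Call<T> {
        T run() throws Exception;
    }

    /**
     * Retries only on TransportException; any other exception propagates at once.
     * Assumes maxAttempts >= 1.
     */
    public static <T> T withRetries(int maxAttempts, Call<T> call) throws Exception {
        TransportException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.run();
            } catch (TransportException e) {
                last = e; // treated as transient: try again
            }
        }
        throw last;
    }

    /** Wraps a call so that transport errors surface as MetaException instead. */
    public static <T> Call<T> wrapAsMetaException(Call<T> inner) {
        return () -> {
            try {
                return inner.run();
            } catch (TransportException e) {
                throw new MetaException("wrapped: " + e.getMessage());
            }
        };
    }

    public static void main(String[] args) throws Exception {
        int[] attempts = {0};
        // Fails on the first attempt only, like an intermittent transport glitch.
        Call<String> flaky = () -> {
            if (++attempts[0] == 1) {
                throw new TransportException("Read a negative frame size");
            }
            return "ok";
        };
        System.out.println(withRetries(3, flaky) + ", attempts: " + attempts[0]);

        attempts[0] = 0;
        try {
            withRetries(3, wrapAsMetaException(flaky));
        } catch (MetaException e) {
            System.out.println("not retried (" + e.getMessage() + "), attempts: " + attempts[0]);
        }
    }
}
```

In this model the unwrapped call recovers on its second attempt, while the wrapped call fails after a single attempt because the retry loop never sees an exception type it considers retryable.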
Note that, as far as we can tell, this issue affects only queries that are
submitted with the runAsync flag on TExecuteStatementReq set to true (which,
in practice, seems to mean all JDBC queries), and it manifests only when
HiveServer2 is using the new HTTP transport mechanism. When both of these
conditions hold, we can fairly reliably reproduce the issue by spawning about
100 simple, concurrent Hive queries (we have been using "show databases"),
two or three of which typically fail. When either condition does not hold,
we can no longer reproduce the issue.
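The reproduction described above could be sketched roughly as follows. This is an illustrative harness, not the script we actually ran: the class and method names are made up for the example, the JDBC URL must be supplied by the caller, and it assumes the Hive JDBC driver (org.apache.hive.jdbc.HiveDriver) is on the classpath. For HTTP transport the URL would typically carry transportMode=http and an httpPath setting, plus a principal setting on a Kerberos-secured cluster.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentShowDatabases {
    /** Runs `count` copies of `task` concurrently and returns how many of them threw. */
    public static int countFailures(int count, Callable<Void> task) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(count);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                tasks.add(task);
            }
            int failures = 0;
            // invokeAll blocks until every task has completed or failed.
            for (Future<Void> result : pool.invokeAll(tasks)) {
                try {
                    result.get();
                } catch (ExecutionException e) {
                    failures++;
                    System.err.println("query failed: " + e.getCause());
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    /** One query per task, each on its own connection. */
    public static Callable<Void> showDatabases(String jdbcUrl) {
        return () -> {
            try (Connection conn = DriverManager.getConnection(jdbcUrl);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("show databases")) {
                while (rs.next()) {
                    rs.getString(1); // drain the result set
                }
            }
            return null;
        };
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: ConcurrentShowDatabases <jdbc-url> [query-count]");
            return;
        }
        // Depending on the driver version, an explicit
        // Class.forName("org.apache.hive.jdbc.HiveDriver") may be needed first.
        int count = args.length > 1 ? Integer.parseInt(args[1]) : 100;
        int failures = countFailures(count, showDatabases(args[0]));
        System.out.println(failures + " of " + count + " queries failed");
    }
}
```

Each task opens its own connection so that the queries really do run concurrently on the server side; sharing a single connection would serialize them and might hide the problem.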
Some example stack traces from the HiveServer2 logs: