[ 
https://issues.apache.org/jira/browse/IMPALA-9345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032007#comment-17032007
 ] 

Tim Armstrong commented on IMPALA-9345:
---------------------------------------

{noformat}
00:28:27 query_test/test_scanners.py::TestOrc::test_type_conversions[protocol: 
beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
orc/def/block] 
...
00:49:11 [gw6] FAILED 
query_test/test_scanners.py::TestOrc::test_type_conversions[protocol: beeswax | 
exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
orc/def/block] 
...
00:50:37 E   OperationalError: Error while processing statement: FAILED: 
Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
{noformat}

I found the problematic query in  [^hive-server2.log].
{noformat}
2020-02-06T00:28:54,152  INFO [253a71f4-6a27-4f1c-91ad-014149e62080 
HiveServer2-Handler-Pool: Thread-49] ql.Driver: Compiling 
command(queryId=jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4):
insert into test_type_conversions_3e81315e.union_complextypes
  select 0, create_union(1, 50, false), array(0), map('key0', 0)
2020-02-06T00:28:54,870  INFO [253a71f4-6a27-4f1c-91ad-014149e62080 
HiveServer2-Handler-Pool: Thread-49] ql.Driver: Completed compiling 
command(queryId=jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4); 
Time taken: 0.718 seconds
2020-02-06T00:28:54,881  INFO [HiveServer2-Background-Pool: Thread-54] 
lockmgr.DbTxnManager: Setting lock request transaction to txnid:2605 for 
queryId=jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4
2020-02-06T00:28:54,881  INFO [HiveServer2-Background-Pool: Thread-54] 
lockmgr.DbLockManager: Requesting: 
queryId=jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4 
LockRequest(component:[LockComponent(type:SHARED_READ, level:TABLE, 
dbname:_dummy_database, tablename:_dummy_table, operationType:SELECT, 
isTransactional:false)], txnid:2605, user:jenkins, 
hostname:impala-ec2-centos74-m5-4xlarge-ondemand-146e.vpc.cloudera.com, 
agentInfo:jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4)
2020-02-06T00:28:54,913  INFO [HiveServer2-Background-Pool: Thread-54] 
lockmgr.DbLockManager: Response to 
queryId=jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4 
LockResponse(lockid:1290, state:ACQUIRED)
2020-02-06T00:28:54,915  INFO [HiveServer2-Background-Pool: Thread-54] 
ql.Driver: Executing 
command(queryId=jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4):
insert into test_type_conversions_3e81315e.union_complextypes
  select 0, create_union(1, 50, false), array(0), map('key0', 0)
2020-02-06T00:28:54,915  INFO [HiveServer2-Background-Pool: Thread-54] 
ql.Driver: Query ID = 
jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4
2020-02-06T00:28:54,915  INFO [HiveServer2-Background-Pool: Thread-54] 
ql.Driver: Total jobs = 3
2020-02-06T00:28:54,929  INFO [HiveServer2-Background-Pool: Thread-54] 
ql.Driver: Launching Job 1 out of 3
2020-02-06T00:28:54,929  INFO [HiveServer2-Background-Pool: Thread-54] 
ql.Driver: Starting task [Stage-1:MAPRED] in serial mode
2020-02-06T00:28:54,959  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.TezConfigurationFactory: SSL conf : ssl-client.xml
2020-02-06T00:28:54,977  INFO [HiveServer2-Background-Pool: Thread-54] 
Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is 
deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
2020-02-06T00:28:55,007  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.TezSessionPoolManager: QueueName: null nonDefaultUser: false 
defaultQueuePool: null hasInitialSessions: false
2020-02-06T00:28:55,021  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.TezSessionPoolManager: Created new tez session for queue: null with session 
id: 59c163cd-d4c8-437e-baaf-6c6bac842b45
2020-02-06T00:28:55,022  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.TezTask: Subscribed to counters: [] for queryId: 
jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4
2020-02-06T00:28:55,022  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.TezTask: Tez session hasn't been created yet. Opening session
2020-02-06T00:28:55,022  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.TezSessionState: User of session id 59c163cd-d4c8-437e-baaf-6c6bac842b45 is 
jenkins
2020-02-06T00:28:55,523  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.DagUtils: Localizing resource because it does not exist: 
file:/data/jenkins/workspace/impala-cdpd-master-core-s3/repos/Impala/fe/target/dependency/postgresql-42.2.5.jar
 to dest: 
s3a://impala-test-uswest2-1/tmp/hive/jenkins/_tez_session_dir/59c163cd-d4c8-437e-baaf-6c6bac842b45-resources/postgresql-42.2.5.jar
2020-02-06T00:28:56,078  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.DagUtils: Resource modification time: 1580977736002 for 
s3a://impala-test-uswest2-1/tmp/hive/jenkins/_tez_session_dir/59c163cd-d4c8-437e-baaf-6c6bac842b45-resources/postgresql-42.2.5.jar
2020-02-06T00:28:56,175  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.TezSessionState: Created new resources: null
2020-02-06T00:28:56,231  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.DagUtils: Jar dir is null / directory doesn't exist. Choosing 
HIVE_INSTALL_DIR - /user/jenkins/.hiveJars
2020-02-06T00:28:56,607  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.TezSessionState: Computed sha: 
36bd724c4553e3e7605ce5432fe262bb789a7b1227c2c02d1bbed3c2cb633c37 for file: 
file:/data0/jenkins/workspace/impala-cdpd-master-core-s3/Impala-Toolchain/cdp_components-1617729/apache-hive-3.1.2000.7.0.2.0-212-bin/lib/hive-exec-3.1.2000.7.0.2.0-212.jar
 of length: 41.20MB in 363 ms
2020-02-06T00:28:56,643  INFO [HiveServer2-Background-Pool: Thread-54] 
tez.DagUtils: Resource modification time: 1573714647000 for 
s3a://impala-test-uswest2-1/user/jenkins/.hiveJars/hive-exec-3.1.2000.7.0.2.0-212-36bd724c4553e3e7605ce5432fe262bb789a7b1227c2c02d1bbed3c2cb633c37.jar
2020-02-06T00:28:56,691  INFO [HiveServer2-Background-Pool: Thread-54] 
client.TezClient: Tez Client Version: [ component=tez-api, 
version=0.9.1.7.0.2.0-212, revision=e297442499d6e603660e831cb455dd395dca32fa, 
SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, 
buildTime=2019-11-13T08:15:15Z ]
2020-02-06T00:28:56,732  INFO [HiveServer2-Background-Pool: Thread-54] 
client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2020-02-06T00:28:57,073  INFO [HiveServer2-Background-Pool: Thread-54] 
client.TezClient: Session mode. Starting session.
2020-02-06T00:28:58,179  INFO [HiveServer2-Background-Pool: Thread-54] 
ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1000 MILLISECONDS)
...
2020-02-06T00:29:17,196  INFO [HiveServer2-Background-Pool: Thread-54] 
retry.RetryInvocationHandler: java.net.ConnectException: Your endpoint 
configuration is wrong; For more details see:  
http://wiki.apache.org/hadoop/UnsetHostnameOrPort, while invoking 
ApplicationClientProtocolPBClientImpl.getNewApplication over null after 1 
failover attempts. Trying to failover after sleeping for 27789ms.
...2020-02-06T00:49:08,645 ERROR [HiveServer2-Background-Pool: Thread-54] 
tez.TezTask: Failed to execute tez graph.
java.net.ConnectException: Your endpoint configuration is wrong; For more 
details see:  http://wiki.apache.org/hadoop/UnsetHostnameOrPort
        at sun.reflect.GeneratedConstructorAccessor60.newInstance(Unknown 
Source) ~[?:?]
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 ~[?:1.8.0_144]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
~[?:1.8.0_144]
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:751) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1557) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at org.apache.hadoop.ipc.Client.call(Client.java:1499) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at org.apache.hadoop.ipc.Client.call(Client.java:1396) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
 ~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
 ~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at com.sun.proxy.$Proxy78.getNewApplication(Unknown Source) ~[?:?]
        at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:274)
 ~[hadoop-yarn-common-3.1.1.7.0.2.0-212.jar:?]
        at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source) ~[?:?]
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_144]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
 ~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
 ~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
 ~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
 ~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
 ~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at com.sun.proxy.$Proxy79.getNewApplication(Unknown Source) ~[?:?]
        at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:270)
 ~[hadoop-yarn-client-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:278)
 ~[hadoop-yarn-client-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.tez.client.TezYarnClient.createApplication(TezYarnClient.java:71) 
~[tez-api-0.9.1.7.0.2.0-212.jar:0.9.1.7.0.2.0-212]
        at 
org.apache.tez.client.TezClient.createApplication(TezClient.java:1152) 
~[tez-api-0.9.1.7.0.2.0-212.jar:0.9.1.7.0.2.0-212]
        at org.apache.tez.client.TezClient.start(TezClient.java:400) 
~[tez-api-0.9.1.7.0.2.0-212.jar:0.9.1.7.0.2.0-212]
        at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezSessionState.java:522)
 ~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:373)
 ~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:298)
 ~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.open(TezSessionPoolSession.java:106)
 ~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezTask.java:380)
 ~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:203) 
~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) 
~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) 
~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2302) 
~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1954) 
~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1627) 
~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1387) 
~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1381) 
~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) 
~[hive-exec-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:229)
 ~[hive-service-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hive.service.cli.operation.SQLOperation.access$600(SQLOperation.java:87)
 ~[hive-service-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:326)
 ~[hive-service-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at java.security.AccessController.doPrivileged(Native Method) 
~[?:1.8.0_144]
        at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_144]
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
 ~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:344)
 ~[hive-service-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_144]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_144]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_144]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_144]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_144]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_144]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
~[?:1.8.0_144]
        at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
~[?:1.8.0_144]
        at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:699) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:803) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1627) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at org.apache.hadoop.ipc.Client.call(Client.java:1443) 
~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        ... 47 more
2020-02-06T00:49:08,646  INFO [HiveServer2-Background-Pool: Thread-54] 
reexec.ReOptimizePlugin: ReOptimization: retryPossible: false
2020-02-06T00:49:08,646 ERROR [HiveServer2-Background-Pool: Thread-54] 
ql.Driver: FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.tez.TezTask
2020-02-06T00:49:08,646  INFO [HiveServer2-Background-Pool: Thread-54] 
ql.Driver: Completed executing 
command(queryId=jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4); 
Time taken: 1213.731 seconds
2020-02-06T00:49:08,646  INFO [HiveServer2-Background-Pool: Thread-54] 
lockmgr.DbTxnManager: Stopped heartbeat for query: 
jenkins_20200206002854_a52253a6-9e52-407e-906a-2db702372df4
2020-02-06T00:49:08,671 ERROR [HiveServer2-Background-Pool: Thread-54] 
operation.Operation: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.tez.TezTask
        at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:352)
 ~[hive-service-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:231)
 ~[hive-service-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hive.service.cli.operation.SQLOperation.access$600(SQLOperation.java:87)
 ~[hive-service-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:326)
 ~[hive-service-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at java.security.AccessController.doPrivileged(Native Method) 
~[?:1.8.0_144]
        at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_144]
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
 ~[hadoop-common-3.1.1.7.0.2.0-212.jar:?]
        at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:344)
 ~[hive-service-3.1.2000.7.0.2.0-212.jar:3.1.2000.7.0.2.0-212]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_144]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_144]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_144]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_144]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_144]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_144]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
{noformat}

So that points towards a misconfiguration. Unsure why we wouldn't see this when 
running other hive queries.

> Hang in TestOrc::test_type_conversions when running create table in hive
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-9345
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9345
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Blocker
>              Labels: broken-build, flaky
>         Attachments: hive-server2.log, hive-server2.log
>
>
> {noformat}
> query_test.test_scanners.TestOrc.test_type_conversions[protocol: beeswax | 
> exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> orc/def/block] (from pytest)
> Failing for the past 1 build (Since Failed#93 )
> Took 2 hr 0 min.
> add description
> Error Message
> query_test/test_scanners.py:1337: in test_type_conversions     
> self.run_test_case('DataErrorsTest/orc-type-checks', vector, unique_database) 
> common/impala_test_suite.py:659: in run_test_case     result = exec_fn(query, 
> user=test_section.get('USER', '').strip() or None) 
> common/impala_test_suite.py:610: in __exec_in_hive     result = 
> h.execute(query, user=user) common/impala_connection.py:334: in execute     r 
> = self.__fetch_results(handle, profile_format=profile_format) 
> common/impala_connection.py:441: in __fetch_results     
> cursor._wait_to_finish() 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py:405:
>  in _wait_to_finish     resp = 
> self._last_operation._rpc('GetOperationStatus', req) 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py:992:
>  in _rpc     response = self._execute(func_name, request) 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py:1009:
>  in _execute     return func(request) 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py:517:
>  in GetOperationStatus     return self.recv_GetOperationStatus() 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py:529:
>  in recv_GetOperationStatus     (fname, mtype, rseqid) = 
> iprot.readMessageBegin() 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py:126:
>  in readMessageBegin     sz = self.readI32() 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py:206:
>  in readI32     buff = self.trans.readAll(4) 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py:58:
>  in readAll     chunk = self.read(sz - have) 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/thrift_sasl/__init__.py:159:
>  in read     self._read_frame() 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/thrift_sasl/__init__.py:163:
>  in _read_frame     header = read_all_compat(self._trans, 4) 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/thrift_sasl/six.py:31:
>  in <lambda>     read_all_compat = lambda trans, sz: trans.readAll(sz) 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py:58:
>  in readAll     chunk = self.read(sz - have) 
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py:105:
>  in read     buff = self.handle.recv(sz) E   Failed: Timeout >7200s
> Stacktrace
> query_test/test_scanners.py:1337: in test_type_conversions
>     self.run_test_case('DataErrorsTest/orc-type-checks', vector, 
> unique_database)
> common/impala_test_suite.py:659: in run_test_case
>     result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:610: in __exec_in_hive
>     result = h.execute(query, user=user)
> common/impala_connection.py:334: in execute
>     r = self.__fetch_results(handle, profile_format=profile_format)
> common/impala_connection.py:441: in __fetch_results
>     cursor._wait_to_finish()
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py:405:
>  in _wait_to_finish
>     resp = self._last_operation._rpc('GetOperationStatus', req)
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py:992:
>  in _rpc
>     response = self._execute(func_name, request)
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py:1009:
>  in _execute
>     return func(request)
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py:517:
>  in GetOperationStatus
>     return self.recv_GetOperationStatus()
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py:529:
>  in recv_GetOperationStatus
>     (fname, mtype, rseqid) = iprot.readMessageBegin()
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py:126:
>  in readMessageBegin
>     sz = self.readI32()
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py:206:
>  in readI32
>     buff = self.trans.readAll(4)
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py:58:
>  in readAll
>     chunk = self.read(sz - have)
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/thrift_sasl/__init__.py:159:
>  in read
>     self._read_frame()
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/thrift_sasl/__init__.py:163:
>  in _read_frame
>     header = read_all_compat(self._trans, 4)
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/infra/python/env/lib/python2.7/site-packages/thrift_sasl/six.py:31:
>  in <lambda>
>     read_all_compat = lambda trans, sz: trans.readAll(sz)
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py:58:
>  in readAll
>     chunk = self.read(sz - have)
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/Impala-Toolchain/thrift-0.9.3-p7/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py:105:
>  in read
>     buff = self.handle.recv(sz)
> E   Failed: Timeout >7200s
> {noformat}
> This has happened on multiple builds, all running against S3. It seems to be 
> happening with high frequency on S3, but isn't deterministic - I saw this 
> commit both succeed and fail:
> {noformat}
>     IMPALA-9334: Log node blacklisting removals in ClusterMembershipMgr
> {noformat}
> All failures did occur after "    IMPALA-9242: Bump CDH_BUILD_NUMBER to 
> 1814051" but not sure if that's related or not.
> Sample hive log:
> {noformat}
> tarmstrong@tarmstrong-box:~/tmplogs/impala-asf-master-core-s3-data-cache-93/logs/cluster$
>  grep a6c81474e8c hive*
> grep: hive: Is a directory
> hive-server2.log:2020-01-28T11:10:49,558  INFO 
> [e00c5c82-1a05-4fc7-bed2-1c62751c2223 HiveServer2-Handler-Pool: Thread-41] 
> ql.Driver: Compiling 
> command(queryId=jenkins_20200128111049_91b0e648-7740-431f-88bb-a6c81474e8c9): 
> create external table test_type_conversions_3e81315e.union_complextypes(
> hive-server2.log:2020-01-28T11:10:50,798  INFO 
> [e00c5c82-1a05-4fc7-bed2-1c62751c2223 HiveServer2-Handler-Pool: Thread-41] 
> ql.Driver: Completed compiling 
> command(queryId=jenkins_20200128111049_91b0e648-7740-431f-88bb-a6c81474e8c9); 
> Time taken: 1.274 seconds
> hive-server2.log:2020-01-28T13:10:16,567  INFO 
> [e00c5c82-1a05-4fc7-bed2-1c62751c2223 HiveServer2-Handler-Pool: Thread-41] 
> operation.Operation: The running operation has been successfully interrupted: 
> jenkins_20200128111049_91b0e648-7740-431f-88bb-a6c81474e8c9
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to