[ https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711843#comment-16711843 ]
Tim Armstrong commented on IMPALA-7931: --------------------------------------- {noformat} 12:51:11 =================================== FAILURES =================================== 12:51:11 __________________ TestShutdownCommand.test_shutdown_executor __________________ 12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, before_shutdown_handle) == 3 12:51:11 custom_cluster/test_restart_services.py:356: in __fetch_and_get_num_backends 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20) 12:51:11 common/impala_service.py:267: in wait_for_query_state 12:51:11 target_state, query_state) 12:51:11 E AssertionError: Did not reach query state in time target=4 actual=5 12:51:11 ---------------------------- Captured stderr setup ----------------------------- 12:51:11 -- 2018-12-04 12:31:17,879 INFO MainThread: Starting cluster with command: /data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/bin/start-impala-cluster.py --cluster_size=3 --num_coordinators=3 --log_dir=/data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests --log_level=1 '--impalad_args="--shutdown_grace_period_s=5 --shutdown_deadline_s=10 --hostname=impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com" ' '--state_store_args="--statestore_update_frequency_ms=50 --statestore_priority_update_frequency_ms=50 --statestore_heartbeat_frequency_ms=50" ' --impalad_args=--default_query_options= 12:51:11 12:31:18 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests/statestored.INFO 12:51:11 12:31:19 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests/catalogd.INFO 12:51:11 12:31:20 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.INFO 12:51:11 12:31:21 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO 12:51:11 12:31:22 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO 12:51:11 12:31:25 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 12:51:11 12:31:25 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000 12:51:11 12:31:25 MainThread: Waiting for num_known_live_backends=3. Current value: 0 12:51:11 12:31:26 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000 12:51:11 12:31:26 MainThread: Waiting for num_known_live_backends=3. Current value: 0 12:51:11 12:31:27 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000 12:51:11 12:31:27 MainThread: Waiting for num_known_live_backends=3. Current value: 2 12:51:11 12:31:28 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000 12:51:11 12:31:28 MainThread: Waiting for num_known_live_backends=3. Current value: 2 12:51:11 12:31:29 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000 12:51:11 12:31:29 MainThread: num_known_live_backends has reached value: 3 12:51:11 12:31:29 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25001 12:51:11 12:31:29 MainThread: num_known_live_backends has reached value: 3 12:51:11 12:31:30 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25002 12:51:11 12:31:30 MainThread: num_known_live_backends has reached value: 3 12:51:11 12:31:30 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors). 12:51:11 -- 2018-12-04 12:31:31,000 INFO MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 12:51:11 -- 2018-12-04 12:31:31,000 INFO MainThread: Getting metric: statestore.live-backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25010 12:51:11 -- 2018-12-04 12:31:31,001 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com 12:51:11 -- 2018-12-04 12:31:31,003 INFO MainThread: Metric 'statestore.live-backends' has reached desired value: 4 12:51:11 -- 2018-12-04 12:31:31,003 INFO MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000 12:51:11 -- 2018-12-04 12:31:31,004 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com 12:51:11 -- 2018-12-04 12:31:31,005 INFO MainThread: num_known_live_backends has reached value: 3 12:51:11 -- 2018-12-04 12:31:31,005 INFO MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25001 12:51:11 -- 2018-12-04 12:31:31,005 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com 12:51:11 -- 2018-12-04 12:31:31,006 INFO MainThread: num_known_live_backends has reached value: 3 12:51:11 -- 2018-12-04 12:31:31,006 INFO MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25002 12:51:11 -- 2018-12-04 12:31:31,007 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com 12:51:11 -- 2018-12-04 12:31:31,008 INFO MainThread: num_known_live_backends has reached value: 3 12:51:11 -- connecting to: localhost:21000 12:51:11 -- connecting to localhost:21050 with impyla 12:51:11 Conn <impala.hiveserver2.HiveServer2Connection object at 0x6538b50> 12:51:11 -- 2018-12-04 12:31:31,152 INFO MainThread: Closing active operation 12:51:11 ----------------------------- Captured stderr call ----------------------------- 12:51:11 SET num_scanner_threads=1; 12:51:11 -- executing async: localhost:21000 12:51:11 select count(*) from functional_parquet.alltypes where sleep(1) = bool_col; 12:51:11 12:51:11 -- 2018-12-04 12:31:33,382 INFO MainThread: Started query a34c3a84775e5599:b2b25eb900000000 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- executing against localhost:21000 12:51:11 : shutdown('localhost:22001'); 12:51:11 12:51:11 -- 2018-12-04 12:31:34,391 INFO MainThread: Started query 8f4d995b47a934be:a5b5cad000000000 12:51:11 -- 2018-12-04 12:31:34,395 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com 12:51:11 -- 2018-12-04 12:31:34,397 INFO MainThread: Getting num_known_live_backends from localhost:25000 12:51:11 -- 2018-12-04 12:31:34,398 INFO MainThread: Starting new HTTP connection (1): localhost 12:51:11 -- 2018-12-04 12:31:34,399 INFO MainThread: Waiting for num_known_live_backends=2. Current value: 3 12:51:11 -- 2018-12-04 12:31:35,400 INFO MainThread: Getting num_known_live_backends from localhost:25000 12:51:11 -- 2018-12-04 12:31:35,401 INFO MainThread: Starting new HTTP connection (1): localhost 12:51:11 -- 2018-12-04 12:31:35,402 INFO MainThread: num_known_live_backends has reached value: 2 12:51:11 SET num_scanner_threads=1; 12:51:11 -- executing async: localhost:21000 12:51:11 select count(*) from functional_parquet.alltypes where sleep(1) = bool_col; 12:51:11 12:51:11 -- 2018-12-04 12:31:35,415 INFO MainThread: Started query cd4bd0b7518145c6:4bd1e0ee00000000 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x600ead0> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x600ead0> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> 12:51:11 -- getting state for operation: <tests.common.impala_connection.OperationHandle object at 0x5803f10> {noformat} > test_shutdown_executor fails with timeout waiting for query target state > ------------------------------------------------------------------------ > > Key: IMPALA-7931 > URL: https://issues.apache.org/jira/browse/IMPALA-7931 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure > Affects Versions: Impala 3.2.0 > Reporter: Lars Volker > Assignee: Tim Armstrong > Priority: Critical > Labels: broken-build > > On a recent S3 test run test_shutdown_executor hit a timeout waiting for a > query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION). > {noformat} > 12:51:11 __________________ TestShutdownCommand.test_shutdown_executor > __________________ > 12:51:11 custom_cluster/test_restart_services.py:209: in > test_shutdown_executor > 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, > before_shutdown_handle) == 3 > 12:51:11 custom_cluster/test_restart_services.py:356: in > __fetch_and_get_num_backends > 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20) > 12:51:11 common/impala_service.py:267: in wait_for_query_state > 12:51:11 target_state, query_state) > 12:51:11 E AssertionError: Did not reach query state in time target=4 > actual=5 > {noformat} > From the logs I can see that the query fails because one of the executors > becomes unreachable: > {noformat} > I1204 12:31:39.954125 5609 impala-server.cc:1792] Query > a34c3a84775e5599:b2b25eb900000000: Failed due to unreachable impalad(s): > jenkins-worker:22001 > {noformat} > The query was {{select count\(*) from functional_parquet.alltypes where > sleep(1) = bool_col}}. > It seems that the query took longer than expected and was still running when > the executor shut down. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org