[
https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711843#comment-16711843
]
Tim Armstrong commented on IMPALA-7931:
---------------------------------------
{noformat}
12:51:11 =================================== FAILURES
===================================
12:51:11 __________________ TestShutdownCommand.test_shutdown_executor
__________________
12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor
12:51:11 assert self.__fetch_and_get_num_backends(QUERY,
before_shutdown_handle) == 3
12:51:11 custom_cluster/test_restart_services.py:356: in
__fetch_and_get_num_backends
12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20)
12:51:11 common/impala_service.py:267: in wait_for_query_state
12:51:11 target_state, query_state)
12:51:11 E AssertionError: Did not reach query state in time target=4 actual=5
12:51:11 ---------------------------- Captured stderr setup
-----------------------------
12:51:11 -- 2018-12-04 12:31:17,879 INFO MainThread: Starting cluster with
command:
/data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/bin/start-impala-cluster.py
--cluster_size=3 --num_coordinators=3
--log_dir=/data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests
--log_level=1 '--impalad_args="--shutdown_grace_period_s=5
--shutdown_deadline_s=10
--hostname=impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com" '
'--state_store_args="--statestore_update_frequency_ms=50
--statestore_priority_update_frequency_ms=50
--statestore_heartbeat_frequency_ms=50" '
--impalad_args=--default_query_options=
12:51:11 12:31:18 MainThread: Starting State Store logging to
/data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests/statestored.INFO
12:51:11 12:31:19 MainThread: Starting Catalog Service logging to
/data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
12:51:11 12:31:20 MainThread: Starting Impala Daemon logging to
/data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.INFO
12:51:11 12:31:21 MainThread: Starting Impala Daemon logging to
/data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
12:51:11 12:31:22 MainThread: Starting Impala Daemon logging to
/data/jenkins/workspace/impala-cdh6.1.x-core-s3/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
12:51:11 12:31:25 MainThread: Found 3 impalad/1 statestored/1 catalogd
process(es)
12:51:11 12:31:25 MainThread: Getting num_known_live_backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000
12:51:11 12:31:25 MainThread: Waiting for num_known_live_backends=3. Current
value: 0
12:51:11 12:31:26 MainThread: Getting num_known_live_backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000
12:51:11 12:31:26 MainThread: Waiting for num_known_live_backends=3. Current
value: 0
12:51:11 12:31:27 MainThread: Getting num_known_live_backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000
12:51:11 12:31:27 MainThread: Waiting for num_known_live_backends=3. Current
value: 2
12:51:11 12:31:28 MainThread: Getting num_known_live_backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000
12:51:11 12:31:28 MainThread: Waiting for num_known_live_backends=3. Current
value: 2
12:51:11 12:31:29 MainThread: Getting num_known_live_backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000
12:51:11 12:31:29 MainThread: num_known_live_backends has reached value: 3
12:51:11 12:31:29 MainThread: Getting num_known_live_backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25001
12:51:11 12:31:29 MainThread: num_known_live_backends has reached value: 3
12:51:11 12:31:30 MainThread: Getting num_known_live_backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25002
12:51:11 12:31:30 MainThread: num_known_live_backends has reached value: 3
12:51:11 12:31:30 MainThread: Impala Cluster Running with 3 nodes (3
coordinators, 3 executors).
12:51:11 -- 2018-12-04 12:31:31,000 INFO MainThread: Found 3 impalad/1
statestored/1 catalogd process(es)
12:51:11 -- 2018-12-04 12:31:31,000 INFO MainThread: Getting metric:
statestore.live-backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25010
12:51:11 -- 2018-12-04 12:31:31,001 INFO MainThread: Starting new HTTP
connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com
12:51:11 -- 2018-12-04 12:31:31,003 INFO MainThread: Metric
'statestore.live-backends' has reached desired value: 4
12:51:11 -- 2018-12-04 12:31:31,003 INFO MainThread: Getting
num_known_live_backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25000
12:51:11 -- 2018-12-04 12:31:31,004 INFO MainThread: Starting new HTTP
connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com
12:51:11 -- 2018-12-04 12:31:31,005 INFO MainThread:
num_known_live_backends has reached value: 3
12:51:11 -- 2018-12-04 12:31:31,005 INFO MainThread: Getting
num_known_live_backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25001
12:51:11 -- 2018-12-04 12:31:31,005 INFO MainThread: Starting new HTTP
connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com
12:51:11 -- 2018-12-04 12:31:31,006 INFO MainThread:
num_known_live_backends has reached value: 3
12:51:11 -- 2018-12-04 12:31:31,006 INFO MainThread: Getting
num_known_live_backends from
impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com:25002
12:51:11 -- 2018-12-04 12:31:31,007 INFO MainThread: Starting new HTTP
connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com
12:51:11 -- 2018-12-04 12:31:31,008 INFO MainThread:
num_known_live_backends has reached value: 3
12:51:11 -- connecting to: localhost:21000
12:51:11 -- connecting to localhost:21050 with impyla
12:51:11 Conn <impala.hiveserver2.HiveServer2Connection object at 0x6538b50>
12:51:11 -- 2018-12-04 12:31:31,152 INFO MainThread: Closing active
operation
12:51:11 ----------------------------- Captured stderr call
-----------------------------
12:51:11 SET num_scanner_threads=1;
12:51:11 -- executing async: localhost:21000
12:51:11 select count(*) from functional_parquet.alltypes where sleep(1) =
bool_col;
12:51:11
12:51:11 -- 2018-12-04 12:31:33,382 INFO MainThread: Started query
a34c3a84775e5599:b2b25eb900000000
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- executing against localhost:21000
12:51:11 : shutdown('localhost:22001');
12:51:11
12:51:11 -- 2018-12-04 12:31:34,391 INFO MainThread: Started query
8f4d995b47a934be:a5b5cad000000000
12:51:11 -- 2018-12-04 12:31:34,395 INFO MainThread: Starting new HTTP
connection (1): impala-ec2-centos74-m5-4xlarge-ondemand-0b8b.vpc.cloudera.com
12:51:11 -- 2018-12-04 12:31:34,397 INFO MainThread: Getting
num_known_live_backends from localhost:25000
12:51:11 -- 2018-12-04 12:31:34,398 INFO MainThread: Starting new HTTP
connection (1): localhost
12:51:11 -- 2018-12-04 12:31:34,399 INFO MainThread: Waiting for
num_known_live_backends=2. Current value: 3
12:51:11 -- 2018-12-04 12:31:35,400 INFO MainThread: Getting
num_known_live_backends from localhost:25000
12:51:11 -- 2018-12-04 12:31:35,401 INFO MainThread: Starting new HTTP
connection (1): localhost
12:51:11 -- 2018-12-04 12:31:35,402 INFO MainThread:
num_known_live_backends has reached value: 2
12:51:11 SET num_scanner_threads=1;
12:51:11 -- executing async: localhost:21000
12:51:11 select count(*) from functional_parquet.alltypes where sleep(1) =
bool_col;
12:51:11
12:51:11 -- 2018-12-04 12:31:35,415 INFO MainThread: Started query
cd4bd0b7518145c6:4bd1e0ee00000000
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x600ead0>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x600ead0>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
12:51:11 -- getting state for operation:
<tests.common.impala_connection.OperationHandle object at 0x5803f10>
{noformat}
> test_shutdown_executor fails with timeout waiting for query target state
> ------------------------------------------------------------------------
>
> Key: IMPALA-7931
> URL: https://issues.apache.org/jira/browse/IMPALA-7931
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 3.2.0
> Reporter: Lars Volker
> Assignee: Tim Armstrong
> Priority: Critical
> Labels: broken-build
>
> On a recent S3 test run test_shutdown_executor hit a timeout waiting for a
> query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).
> {noformat}
> 12:51:11 __________________ TestShutdownCommand.test_shutdown_executor
> __________________
> 12:51:11 custom_cluster/test_restart_services.py:209: in
> test_shutdown_executor
> 12:51:11 assert self.__fetch_and_get_num_backends(QUERY,
> before_shutdown_handle) == 3
> 12:51:11 custom_cluster/test_restart_services.py:356: in
> __fetch_and_get_num_backends
> 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20)
> 12:51:11 common/impala_service.py:267: in wait_for_query_state
> 12:51:11 target_state, query_state)
> 12:51:11 E AssertionError: Did not reach query state in time target=4
> actual=5
> {noformat}
> From the logs I can see that the query fails because one of the executors
> becomes unreachable:
> {noformat}
> I1204 12:31:39.954125 5609 impala-server.cc:1792] Query
> a34c3a84775e5599:b2b25eb900000000: Failed due to unreachable impalad(s):
> jenkins-worker:22001
> {noformat}
> The query was {{select count\(*) from functional_parquet.alltypes where
> sleep(1) = bool_col}}.
> It seems that the query took longer than expected and was still running when
> the executor shut down.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]