[ https://issues.apache.org/jira/browse/IMPALA-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416095#comment-17416095 ]
Qifan Chen edited comment on IMPALA-10895 at 9/16/21, 12:55 PM:
----------------------------------------------------------------
This bug recurred recently in a core ASAN test run.
{code:java}
custom_cluster/test_query_retries.py:742: in test_retrying_query_cancel
assert retry_status.group(1) == 'RETRYING'
E assert 'RETRIED' == 'RETRYING'
E - RETRIED
E + RETRYING
SET client_identifier=custom_cluster/test_query_retries.py::TestQueryRetries::()::test_retrying_query_cancel;
SET retry_failed_queries=true;
-- executing async: localhost:21000
select count(*) from tpch_parquet.lineitem;
{code}
> TestQueryRetries.test_retrying_query_cancel is flaky
> ----------------------------------------------------
>
> Key: IMPALA-10895
> URL: https://issues.apache.org/jira/browse/IMPALA-10895
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Labels: broken-build
> Attachments: catalogd.INFO.gz, impalad.INFO.gz,
> impalad_node1.INFO.gz, impalad_node2.INFO.gz, statestored.INFO.gz
>
>
> Saw this test fail in an ASAN build:
> {code}
> custom_cluster.test_query_retries.TestQueryRetries.test_retrying_query_cancel
> {code}
> Stacktrace
> {code}
> custom_cluster/test_query_retries.py:742: in test_retrying_query_cancel
> assert retry_status.group(1) == 'RETRYING'
> E assert 'RETRIED' == 'RETRYING'
> E - RETRIED
> E + RETRYING
> {code}
> Standard Error
> {code}
> -- 2021-08-29 08:29:29,112 INFO MainThread: Starting cluster with
> command:
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/bin/start-impala-cluster.py
> '--state_store_args=--statestore_update_frequency_ms=50
> --statestore_priority_update_frequency_ms=50
> --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3
> --log_dir=/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests
> --log_level=1
> '--impalad_args=--debug_actions=RETRY_DELAY_CHECKING_ORIGINAL_DRIVER:SLEEP@1000
> ' '--state_store_args=--statestore_heartbeat_frequency_ms=60000 '
> --impalad_args=--default_query_options=
> 08:29:29 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 08:29:29 MainThread: Starting State Store logging to
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 08:29:29 MainThread: Starting Catalog Service logging to
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 08:29:29 MainThread: Starting Impala Daemon logging to
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 08:29:29 MainThread: Starting Impala Daemon logging to
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 08:29:29 MainThread: Starting Impala Daemon logging to
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 08:29:32 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:29:32 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:29:32 MainThread: Getting num_known_live_backends from
> impala-ec2-centos74-m5-4xlarge-ondemand-061e.vpc.cloudera.com:25000
> 08:29:32 MainThread: Debug webpage not yet available:
> HTTPConnectionPool(host='impala-ec2-centos74-m5-4xlarge-ondemand-061e.vpc.cloudera.com',
> port=25000): Max retries exceeded with url: /backends?json (Caused by
> NewConnectionError('<urllib3.connection.HTTPConnection object at
> 0x7f10c1d5d150>: Failed to establish a new connection: [Errno 111] Connection
> refused',))
> 08:29:34 MainThread: Debug webpage did not become available in expected time.
> 08:29:34 MainThread: Waiting for num_known_live_backends=3. Current value:
> None
> 08:29:35 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:29:35 MainThread: Getting num_known_live_backends from
> impala-ec2-centos74-m5-4xlarge-ondemand-061e.vpc.cloudera.com:25000
> 08:29:35 MainThread: Waiting for num_known_live_backends=3. Current value: 0
> 08:29:36 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:29:36 MainThread: Getting num_known_live_backends from
> impala-ec2-centos74-m5-4xlarge-ondemand-061e.vpc.cloudera.com:25000
> 08:29:36 MainThread: num_known_live_backends has reached value: 3
> 08:29:37 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:29:37 MainThread: Getting num_known_live_backends from
> impala-ec2-centos74-m5-4xlarge-ondemand-061e.vpc.cloudera.com:25001
> 08:29:37 MainThread: num_known_live_backends has reached value: 3
> 08:29:38 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:29:38 MainThread: Getting num_known_live_backends from
> impala-ec2-centos74-m5-4xlarge-ondemand-061e.vpc.cloudera.com:25002
> 08:29:38 MainThread: num_known_live_backends has reached value: 3
> 08:29:38 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3
> executors).
> -- 2021-08-29 08:29:38,626 DEBUG MainThread: Found 3 impalad/1
> statestored/1 catalogd process(es)
> -- 2021-08-29 08:29:38,626 INFO MainThread: Getting metric:
> statestore.live-backends from
> impala-ec2-centos74-m5-4xlarge-ondemand-061e.vpc.cloudera.com:25010
> -- 2021-08-29 08:29:38,629 INFO MainThread: Metric
> 'statestore.live-backends' has reached desired value: 4
> -- 2021-08-29 08:29:38,629 DEBUG MainThread: Getting
> num_known_live_backends from
> impala-ec2-centos74-m5-4xlarge-ondemand-061e.vpc.cloudera.com:25000
> -- 2021-08-29 08:29:38,631 INFO MainThread: num_known_live_backends has
> reached value: 3
> -- 2021-08-29 08:29:38,631 DEBUG MainThread: Getting
> num_known_live_backends from
> impala-ec2-centos74-m5-4xlarge-ondemand-061e.vpc.cloudera.com:25001
> -- 2021-08-29 08:29:38,633 INFO MainThread: num_known_live_backends has
> reached value: 3
> -- 2021-08-29 08:29:38,633 DEBUG MainThread: Getting
> num_known_live_backends from
> impala-ec2-centos74-m5-4xlarge-ondemand-061e.vpc.cloudera.com:25002
> -- 2021-08-29 08:29:38,635 INFO MainThread: num_known_live_backends has
> reached value: 3
> SET client_identifier=custom_cluster/test_query_retries.py::TestQueryRetries::()::test_retrying_query_cancel;
> -- connecting to: localhost:21000
> -- connecting to localhost:21050 with impyla
> -- 2021-08-29 08:29:38,801 INFO MainThread: Closing active operation
> -- connecting to localhost:28000 with impyla
> -- 2021-08-29 08:29:38,822 INFO MainThread: Closing active operation
> -- 2021-08-29 08:29:38,853 INFO MainThread: Killing <ImpaladProcess PID:
> 6501
> (/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/latest/service/impalad
> -disconnected_session_timeout 21600 -kudu_client_rpc_timeout_ms 60000
> -kudu_master_hosts localhost -mem_limit=12884901888 -logbufsecs=5 -v=1
> -max_log_files=0 -log_filename=impalad_node1
> -log_dir=/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests
> -beeswax_port=21001 -hs2_port=21051 -hs2_http_port=28001 -krpc_port=27001
> -state_store_subscriber_port=23001 -webserver_port=25001
> --debug_actions=RETRY_DELAY_CHECKING_ORIGINAL_DRIVER:SLEEP@1000
> --default_query_options=)> with signal 9
> SET client_identifier=custom_cluster/test_query_retries.py::TestQueryRetries::()::test_retrying_query_cancel;
> SET retry_failed_queries=true;
> -- executing async: localhost:21000
> select count(*) from tpch_parquet.lineitem;
> -- 2021-08-29 08:29:39,647 INFO MainThread: Started query
> de46b6f56f935fa5:f4c701b200000000
> -- canceling operation: <tests.common.impala_connection.OperationHandle
> object at 0x7fb2a8026250>
> {code}
> CC [~xqhe]