[ 
https://issues.apache.org/jira/browse/IMPALA-12235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenzhe Zhou resolved IMPALA-12235.
----------------------------------
    Fix Version/s: Impala 4.3.0
       Resolution: Fixed

> test_multiple_coordinator() failed because _start_impala_cluster() returned 
> non-zero exit status
> ------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-12235
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12235
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Fang-Yu Rao
>            Assignee: Wenzhe Zhou
>            Priority: Major
>              Labels: broken-build
>             Fix For: Impala 4.3.0
>
>
> We found that test_multiple_coordinators() could fail because 
> [_start_impala_cluster()|https://github.com/apache/impala/blame/master/tests/common/custom_cluster_test_suite.py#L283]
>  returned a non-zero exit status. test_multiple_coordinators() calls 
> _start_impala_cluster() at 
> https://github.com/apache/impala/blame/master/tests/custom_cluster/test_coordinators.py#L41C10-L41C31.
> *Error Message*
> {code:java}
> CalledProcessError: Command 
> '['/data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/bin/start-impala-cluster.py',
>  '--state_store_args=--statestore_update_frequency_ms=50     
> --statestore_priority_update_frequency_ms=50     
> --statestore_heartbeat_frequency_ms=50', '--cluster_size=3', 
> '--num_coordinators=2', 
> '--log_dir=/data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/logs/custom_cluster_tests',
>  '--log_level=1', '--impalad_args=--default_query_options=']' returned 
> non-zero exit status 1
> {code}
> *Stacktrace*
> {code:java}
> custom_cluster/test_coordinators.py:41: in test_multiple_coordinators
>     self._start_impala_cluster([], num_coordinators=2, cluster_size=3)
> common/custom_cluster_test_suite.py:330: in _start_impala_cluster
>     check_call(cmd + options, close_fds=True)
> /data/jenkins/workspace/impala-asf-master-core-erasure-coding/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/subprocess.py:190:
>  in check_call
>     raise CalledProcessError(retcode, cmd)
> E   CalledProcessError: Command 
> '['/data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/bin/start-impala-cluster.py',
>  '--state_store_args=--statestore_update_frequency_ms=50     
> --statestore_priority_update_frequency_ms=50     
> --statestore_heartbeat_frequency_ms=50', '--cluster_size=3', 
> '--num_coordinators=2', 
> '--log_dir=/data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/logs/custom_cluster_tests',
>  '--log_level=1', '--impalad_args=--default_query_options=']' returned 
> non-zero exit status 1
> {code}
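> The following console output shows that the impalad on port 25000 reached 
> num_known_live_backends=3, but the impalad on port 25001 stayed at 2 and did 
> not reach 3 within 4 minutes (20:54:48 to 20:58:48), so the command that 
> starts the cluster failed with a non-zero exit status.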
> {code}
> -- 2023-06-21 20:54:40,594 INFO     MainThread: Starting cluster with 
> command: 
> /data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/bin/start-impala-cluster.py
>  '--state_store_args=--statestore_update_frequency_ms=50     
> --statestore_priority_update_frequency_ms=50     
> --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=2 
> --log_dir=/data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/logs/custom_cluster_tests
>  --log_level=1 --impalad_args=--default_query_options=
> 20:54:41 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 20:54:41 MainThread: Starting State Store logging to 
> /data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 20:54:42 MainThread: Starting Catalog Service logging to 
> /data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 20:54:43 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 20:54:43 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 20:54:43 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 20:54:46 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 20:54:46 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 20:54:46 MainThread: Getting num_known_live_backends from 
> impala-ec2-centos79-m6i-4xlarge-ondemand-1576.vpc.cloudera.com:25000
> 20:54:46 MainThread: Waiting for num_known_live_backends=3. Current value: 1
> 20:54:47 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 20:54:47 MainThread: Getting num_known_live_backends from 
> impala-ec2-centos79-m6i-4xlarge-ondemand-1576.vpc.cloudera.com:25000
> 20:54:47 MainThread: Waiting for num_known_live_backends=3. Current value: 1
> 20:54:48 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 20:54:48 MainThread: Getting num_known_live_backends from 
> impala-ec2-centos79-m6i-4xlarge-ondemand-1576.vpc.cloudera.com:25000
> 20:54:48 MainThread: num_known_live_backends has reached value: 3
> 20:54:48 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 20:54:48 MainThread: Getting num_known_live_backends from 
> impala-ec2-centos79-m6i-4xlarge-ondemand-1576.vpc.cloudera.com:25001
> 20:54:48 MainThread: Waiting for num_known_live_backends=3. Current value: 2
> ...
> 20:58:48 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 20:58:48 MainThread: Getting num_known_live_backends from 
> impala-ec2-centos79-m6i-4xlarge-ondemand-1576.vpc.cloudera.com:25001
> 20:58:48 MainThread: Waiting for num_known_live_backends=3. Current value: 2
> 20:58:49 MainThread: Error starting cluster
> Traceback (most recent call last):
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/bin/start-impala-cluster.py",
>  line 931, in <module>
>     expected_cluster_size - expected_catalog_delays)
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/tests/common/impala_cluster.py",
>  line 205, in wait_until_ready
>     early_abort_fn=check_processes_still_running)
>   File 
> "/data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/tests/common/impala_service.py",
>  line 374, in wait_for_num_known_live_backends
>     assert 0, 'num_known_live_backends did not reach expected value in time'
> AssertionError: num_known_live_backends did not reach expected value in time
> -- 2023-06-21 20:58:49,141 DEBUG    MainThread: Found 3 impalad/1 
> statestored/1 catalogd process(es)
> {code}
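
For readers unfamiliar with this kind of failure, the wait that timed out above can be pictured as a poll-until-deadline loop. The sketch below is only an illustration under stated assumptions: wait_for_value, get_value, timeout_s and interval_s are hypothetical names, the 240-second default merely mirrors the roughly 4 minutes seen in the log, and this is not the code in tests/common/impala_service.py.

```python
import time


def wait_for_value(get_value, expected, timeout_s=240.0, interval_s=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll get_value() until it returns `expected` or `timeout_s` elapses.

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not Impala's actual implementation.
    """
    deadline = clock() + timeout_s
    while True:
        current = get_value()
        if current == expected:
            return current
        if clock() >= deadline:
            # Mirrors the shape of the failure in the log: the metric never
            # reached the expected value before the deadline.
            raise AssertionError(
                "value did not reach %r in time (last seen: %r)"
                % (expected, current))
        sleep(interval_s)


if __name__ == "__main__":
    # Simulate a daemon that only ever sees 2 of the 3 expected backends.
    try:
        wait_for_value(lambda: 2, expected=3, timeout_s=3.0, interval_s=1.0)
    except AssertionError as err:
        print("startup check failed:", err)
```

In such a loop, one daemon stuck at 2 is enough to fail the whole startup, even though the other daemon reported 3 within seconds. That matches the pattern in the console output above.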



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
