[ https://issues.apache.org/jira/browse/IMPALA-12556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785719#comment-17785719 ]
ASF subversion and git services commented on IMPALA-12556:
----------------------------------------------------------
Commit 97eae40d10193b5cfb10ca4d2b4034dce44274d0 in impala's branch
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=97eae40d1 ]
IMPALA-12556: Fix flaky test test_two_statestored_with_force_active
The test test_two_statestored_with_force_active failed occasionally
because both statestore instances were assigned the active role.
This patch fixes the issue by handling the case where both statestore
instances are restarted with the flag "statestore_force_active" in the
same way as the case where both instances are restarted without the
flag.
Testing:
- Ran test_two_statestored_with_force_active on Jenkins hundreds of
times without failure.
- Ran test_two_statestored_with_force_active on a local machine a
thousand times without failure.
- Ran all tests in test_statestored_ha.py repeatedly for over 12 hours
on Jenkins without failure.
- Passed core tests.
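
The repeated runs listed above can be approximated with a small loop that
re-invokes a test command and stops at the first failure. This is only a
sketch: the helper name run_repeatedly is hypothetical, and the command
shown is a placeholder, not the invocation actually used on Jenkins or on
the local machine.

```python
import subprocess
import sys


def run_repeatedly(cmd, iterations):
    """Run cmd up to `iterations` times.

    Returns the number of passing runs completed before the first failure,
    or `iterations` if every run exited with status 0.
    """
    for i in range(iterations):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Stop immediately so the failing run's logs are not overwritten.
            return i
    return iterations


if __name__ == "__main__":
    # Placeholder command (assumption): substitute the real test invocation.
    placeholder = [sys.executable, "-c", "pass"]
    passed = run_repeatedly(placeholder, 5)
    print("passed %d of 5 runs" % passed)
```

Stopping at the first failure keeps the logs of the failing run intact,
which matters for an intermittent failure that may take many runs to hit.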
Change-Id: I3e6f85233ff6fa747a6aa5ef8d093627885d20b2
Reviewed-on: http://gerrit.cloudera.org:8080/20699
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Wenzhe Zhou <[email protected]>
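
As a rough illustration of the behaviour the commit message describes, the
Python sketch below treats the "both flags set" case exactly like the
"neither flag set" case. It is not the code from the patch: the function
name resolve_active, the default_election callback, and the prefer_first
rule are all hypothetical stand-ins for whatever negotiation the two
statestore instances actually perform.

```python
def resolve_active(force_active_1, force_active_2, default_election):
    """Return (is_active_1, is_active_2) for a pair of statestore instances.

    Hypothetical sketch: default_election is an assumed callback returning
    0 or 1, used whenever the flags alone cannot decide.
    """
    if force_active_1 != force_active_2:
        # Exactly one instance was started with statestore_force_active.
        roles = (force_active_1, force_active_2)
    else:
        # Both set or both unset: take the same default path in either case.
        winner = default_election()
        roles = (winner == 0, winner == 1)
    # Mirrors the invariant from the failed DCHECK in the issue report:
    # the two instances must never both be active.
    assert not roles[0] or not roles[1]
    return roles


def prefer_first():
    # Assumed default for illustration only: always pick the first instance.
    return 0


print(resolve_active(True, True, prefer_first))
print(resolve_active(False, True, prefer_first))
```

The point of the sketch is that no combination of flags can leave both
instances active, so a check such as
"!statestore_is_active || !statestore2_is_active" holds for every input.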
> test_two_statestored_with_force_active fails or flaky
> -----------------------------------------------------
>
> Key: IMPALA-12556
> URL: https://issues.apache.org/jira/browse/IMPALA-12556
> Project: IMPALA
> Issue Type: Bug
> Components: Distributed Exec
> Affects Versions: Impala 4.4.0
> Reporter: Laszlo Gaal
> Assignee: Wenzhe Zhou
> Priority: Blocker
>
> custom_cluster.test_statestored_ha.TestStatestoredHA.test_two_statestored_with_force_active
> failed in a precommit run.
> Symptom:
> {code}
> common/custom_cluster_test_suite.py:208: in setup_method
> self._start_impala_cluster(cluster_args, **kwargs)
> common/custom_cluster_test_suite.py:330: in _start_impala_cluster
> check_call(cmd + options, close_fds=True)
> ../toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/subprocess.py:190:
> in check_call
> raise CalledProcessError(retcode, cmd)
> E CalledProcessError: Command
> '['/home/ubuntu/Impala/bin/start-impala-cluster.py',
> '--state_store_args=--statestore_update_frequency_ms=50
> --statestore_priority_update_frequency_ms=50
> --statestore_heartbeat_frequency_ms=50', '--cluster_size=3',
> '--num_coordinators=3',
> '--log_dir=/home/ubuntu/Impala/logs/custom_cluster_tests', '--log_level=1',
> '--state_store_args=--statestore_force_active=true ',
> '--enable_statestored_ha', '--impalad_args=--default_query_options=']'
> returned non-zero exit status 1
> {code}
> The test dies with a FATAL log entry in catalogd's log:
> {code}
> DCHECK found in log file:
> /home/ubuntu/Impala/logs/custom_cluster_tests/catalogd.FATAL
> {code}
> {code}
> Log file created at: 2023/11/11 23:36:24
> Running on machine: ip-172-31-52-128
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F1111 23:36:24.798915 2270244 statestore-subscriber.cc:336] Check failed:
> !statestore_is_active || !statestore2_is_active
> {code}
> Offending precommit run:
> https://jenkins.impala.io/job/ubuntu-20.04-from-scratch/874/ (preserved).