[
https://issues.apache.org/jira/browse/IMPALA-12550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenzhe Zhou resolved IMPALA-12550.
----------------------------------
Fix Version/s: Impala 4.4.0
Resolution: Fixed
> test_statestored_auto_failover_with_disabling_network flaky
> -----------------------------------------------------------
>
> Key: IMPALA-12550
> URL: https://issues.apache.org/jira/browse/IMPALA-12550
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.4.0
> Reporter: Wenzhe Zhou
> Assignee: Wenzhe Zhou
> Priority: Major
> Fix For: Impala 4.4.0
>
>
> TestStatestoredHA.test_statestored_auto_failover_with_disabling_network
> failed with the following stack trace when the test was run repeatedly.
> tests/custom_cluster/test_statestored_ha.py:645: in test_statestored_auto_failover_with_disabling_network
>     "statestore.in-ha-recovery-mode", expected_value=False, timeout=120)
> tests/common/impala_service.py:144: in wait_for_metric_value
>     self.__metric_timeout_assert(metric_name, expected_value, timeout)
> tests/common/impala_service.py:213: in __metric_timeout_assert
>     assert 0, assert_string
> E   AssertionError: Metric statestore.in-ha-recovery-mode did not reach value False in 120s.
> From the log messages, the issue was caused by delays in the HA handshake RPC
> between the two statestore instances. Sometimes the active statestore took a few
> minutes to respond to handshake requests from the standby statestore.
> This issue is different from IMPALA-12525, which was caused by a locking issue on
> the subscriber side.
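
For readers unfamiliar with this failure mode, the assertion in the trace is what a poll-until-timeout check produces when the awaited state arrives too late. The following is a minimal sketch of that pattern only, not the code in tests/common/impala_service.py. The names wait_for_value, read_metric, interval_s, clock and sleep are hypothetical and chosen for illustration.

```python
import time


def wait_for_value(read_metric, metric_name, expected_value, timeout_s=120.0,
                   interval_s=1.0, clock=time.monotonic, sleep=time.sleep):
    """Poll read_metric() until it returns expected_value or timeout_s elapses.

    Hypothetical helper for illustration; it is not Impala's implementation.
    """
    deadline = clock() + timeout_s
    while True:
        value = read_metric()
        if value == expected_value:
            return value
        if clock() >= deadline:
            # Same shape as the message in the trace above.
            raise AssertionError(
                "Metric %s did not reach value %s in %ss."
                % (metric_name, expected_value, timeout_s))
        sleep(interval_s)
```

Under this sketch, a handshake response that takes a few minutes would keep the polled value at True past a 120 second deadline, so the check would fail even though recovery eventually completes.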
--
This message was sent by Atlassian Jira
(v8.20.10#820010)