Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20657 )
Change subject: IMPALA-12525: Fix flaky test test_statestored_manual_failover ...................................................................... Patch Set 5: (6 comments) http://gerrit.cloudera.org:8080/#/c/20657/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20657/4//COMMIT_MSG@14 PS4, Line 14: all subscribers re-registered nit: some subscribers to not receive http://gerrit.cloudera.org:8080/#/c/20657/4/be/src/statestore/statestore-subscriber.cc File be/src/statestore/statestore-subscriber.cc: http://gerrit.cloudera.org:8080/#/c/20657/4/be/src/statestore/statestore-subscriber.cc@387 PS4, Line 387: *active_statestore_conn_state = statestore2_->GetStatestoreConnState(); This patch remove the condition that the heartbeating statestored must not be an active one if request_active_conn_state is True. Should it log WARNING if the heartbeating statestored is still known as an active one? http://gerrit.cloudera.org:8080/#/c/20657/4/be/src/statestore/statestore-subscriber.cc@377 PS4, Line 377: if (statestore_->IsMatchingStatestoreId(statestore_id) || statestore2_ == nullptr) { : // Due to race, subscriber could receive heartbeat before receiving registration : // response. "statestore2_ == nullptr" means HA is not enabled. In that case, : // we still handle the heartbeat request from statestore even registration response : // is not received. : statestore_->Heartbeat(registration_id); : // Report connection state with active statestore instance for the request from : // standby statestore. It's possible that the notification of active statestored : // change has not been received. : if (request_active_conn_state && statestore2_ != nullptr) { : *active_statestore_conn_state = statestore2_->GetStatestoreConnState(); : } : } else if (statestore2_ != nullptr : && statestore2_->IsMatchingStatestoreId(statestore_id)) { : statestore2_->Heartbeat(registration_id); : // Report connection state with active statestore instance for the request from : // standby statestore. It's possible that the notification of active statestored : // change has not been received. : if (request_active_conn_state) { : *active_statestore_conn_state = statestore_->GetStatestoreConnState(); : } nit: This part of code confuse me a bit. I think it will be clearer to have local pointer variable that clarify the heartbeating statestored vs the other one (maybe, heartbeating_statestored & other_statestored pointer). http://gerrit.cloudera.org:8080/#/c/20657/4/be/src/statestore/statestore-subscriber.cc@445 PS4, Line 445: active_statestore = GetActiveStatestore(); : active_statestore->SetStatestoreActive(is_active, active_statestored_version); : standby_statestore = GetStandbyStatestore(); : standby_statestore->SetStatestoreActive(!is_active, active_statestored_version); Why this active/standby set/switch does not hold statestore_ha_lock_? Are they need to be atomic or not? http://gerrit.cloudera.org:8080/#/c/20657/4/be/src/statestore/statestore-subscriber.cc@461 PS4, Line 461: // Active/standby statestored are not set. This could happen if statestoreds were : // started after subscribers' registration attemption. Add WARNING log for this case. http://gerrit.cloudera.org:8080/#/c/20657/4/tests/custom_cluster/test_statestored_ha.py File tests/custom_cluster/test_statestored_ha.py: http://gerrit.cloudera.org:8080/#/c/20657/4/tests/custom_cluster/test_statestored_ha.py@681 PS4, Line 681: execute_query nit: Maybe use execute_query_expect_success() here? -- To view, visit http://gerrit.cloudera.org:8080/20657 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If03bf09d22a2875d2c1eec8a4f62eeefc5d855dc Gerrit-Change-Number: 20657 Gerrit-PatchSet: 5 Gerrit-Owner: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Tue, 07 Nov 2023 23:37:52 +0000 Gerrit-HasComments: Yes
