Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9795 )
Change subject: [linked_list-test] fix flake with the 3-4-3 scheme
......................................................................

[linked_list-test] fix flake with the 3-4-3 scheme

The LinkedListTest.TestLoadWhileOneServerDownAndVerify scenario became flaky when run with --seconds_to_run set to about 800 or more once the 3-4-3 replica management scheme became the default. Those cases (--seconds_to_run=X, X >= 800) are special because, with the linked list input data, the stopped replica falls behind the WAL segment GC threshold, so the system automatically replaces the failed replica. With the 3-4-3 scheme, the replacement replica is added as a non-voter.

WaitForServersToAgree() looks only at the OpId indices and does not distinguish between voter and non-voter replicas. However, the verification phase of the scenario assumes that the only replica left alive is a voter.

Prior to this fix, the scenario didn't account for the case where the replica at the restarted tablet server was still a non-voter, and in most cases the other 2 of the 3 tservers were shut down before the newly added replica was promoted. As a result, that replica remained a non-voter and the written data could not be read back from it.

This patch adds a step to verify that all 3 replicas are registered as voters with the master(s) before shutting down the tservers hosting the 2 source replicas. The scenario is now stable when running with --seconds_to_run=900.
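
The added wait step can be illustrated with a minimal, self-contained sketch. This is not the code from the patch and does not use Kudu's API: ReplicaInfo, MemberType, IndicesAgree(), AllVoters() and WaitForAllVoters() are hypothetical names, and the fetch callback stands in for whatever queries the master for the tablet's Raft configuration. The sketch only shows the idea that agreement on OpId indices is a weaker condition than "all replicas are voters", and that the test should poll for the latter before stopping the other tservers.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-ins for replica membership data; NOT Kudu types.
enum class MemberType { kVoter, kNonVoter };

struct ReplicaInfo {
  int tserver_id;
  MemberType type;
  long last_op_index;
};

// True when every replica reports the same last OpId index. This mirrors the
// kind of check the commit message attributes to WaitForServersToAgree():
// it says nothing about voter vs. non-voter membership.
bool IndicesAgree(const std::vector<ReplicaInfo>& replicas) {
  return std::all_of(replicas.begin(), replicas.end(),
                     [&](const ReplicaInfo& r) {
                       return r.last_op_index == replicas.front().last_op_index;
                     });
}

// True when exactly `expected` replicas are present and all of them are voters.
bool AllVoters(const std::vector<ReplicaInfo>& replicas, std::size_t expected) {
  return replicas.size() == expected &&
         std::all_of(replicas.begin(), replicas.end(),
                     [](const ReplicaInfo& r) {
                       return r.type == MemberType::kVoter;
                     });
}

// Polls `fetch` (assumed to return the current replica list) until all
// replicas are voters or the timeout expires. Returns true on success.
bool WaitForAllVoters(
    const std::function<std::vector<ReplicaInfo>()>& fetch,
    std::size_t expected,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds poll_interval = std::chrono::milliseconds(10)) {
  assert(poll_interval.count() > 0);
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    if (AllVoters(fetch(), expected)) return true;
    if (std::chrono::steady_clock::now() >= deadline) return false;
    std::this_thread::sleep_for(poll_interval);
  }
}
```

In a test shaped like the one described above, a call such as WaitForAllVoters(fetch, 3, timeout) would sit between the index-agreement wait and the shutdown of the two other tservers, so the surviving replica is guaranteed to be a voter before verification starts.
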
Change-Id: I132206371e2935f1e0f39e9eacad866fde22c5b8
Reviewed-on: http://gerrit.cloudera.org:8080/9795
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: Mike Percy <[email protected]>
---
M src/kudu/integration-tests/linked_list-test.cc
1 file changed, 31 insertions(+), 14 deletions(-)

Approvals:
  Alexey Serbin: Verified
  Mike Percy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/9795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I132206371e2935f1e0f39e9eacad866fde22c5b8
Gerrit-Change-Number: 9795
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
