Hello Mike Percy, Jean-Daniel Cryans, Todd Lipcon,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/9795
to look at the new patch set (#3).
Change subject: [linked_list-test] fix flake with the 3-4-3 scheme
......................................................................
[linked_list-test] fix flake with the 3-4-3 scheme
The LinkedListTest.TestLoadWhileOneServerDownAndVerify scenario became
flaky when running with --seconds_to_run set to about 800 or more
once the 3-4-3 replica management scheme became the default.
Those cases (--seconds_to_run=X, X >= 800) are special because the
stopped replica falls behind the WAL segment GC threshold when running
with the linked list input data, so the system automatically replaces
the failed replica. With the 3-4-3 scheme, the replacement replica is
added as a non-voter. WaitForServersToAgree() looks only at the OpId
indices and does not distinguish between voter and non-voter replicas.
However, the verification phase of the scenario assumes that the only
replica left alive is a voter replica.
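To illustrate the gap described above, here is a minimal, self-contained
sketch. It is not Kudu code: ReplicaState, MemberType, IndexesAgree() and
AllVoters() are hypothetical names made up for this example. It only shows
that agreement on OpId indices says nothing about whether a replica is a
voter.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of a replica. These names are illustrative only and
// are not Kudu's actual types.
enum class MemberType { kVoter, kNonVoter };

struct ReplicaState {
  std::string uuid;
  int64_t last_op_index;   // index of the last OpId the replica has logged
  MemberType member_type;  // role in the Raft configuration
};

// Returns true if all replicas report the same last OpId index.
// Membership type is deliberately not consulted here.
bool IndexesAgree(const std::vector<ReplicaState>& replicas) {
  for (const auto& r : replicas) {
    if (r.last_op_index != replicas.front().last_op_index) return false;
  }
  return true;
}

// Returns true only if every replica is a voter.
bool AllVoters(const std::vector<ReplicaState>& replicas) {
  for (const auto& r : replicas) {
    if (r.member_type != MemberType::kVoter) return false;
  }
  return true;
}
```

With two voters and one caught-up non-voter at the same index,
IndexesAgree() returns true while AllVoters() returns false, which is the
state in which the scenario used to proceed to the verification phase.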
Prior to this fix, the scenario didn't account for the case when the
replica at the restarted tablet server was still a non-voter, and in
most cases the other 2 of the 3 tservers were shut down before the
newly added replica was promoted. As a result, the latter replica
remained a non-voter and the written data could not be read back from
it.
This patch adds a step to verify that all 3 replicas are registered
as voters with the master(s) before shutting down the tservers hosting
the 2 source replicas. The scenario is now stable when running with
--seconds_to_run=900.
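The added verification step can be pictured roughly as a poll-until-deadline
loop. The sketch below is an assumption about the general shape of such a
wait, not the code in this patch: ReplicaRole, WaitForVoterCount() and the
fetch_roles callback (standing in for a query to the master) are
hypothetical.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical role enum, for illustration only.
enum class ReplicaRole { kVoter, kNonVoter };

// Polls fetch_roles until exactly expected_voters replicas are reported and
// all of them are voters, or until the timeout expires. fetch_roles is a
// hypothetical stand-in for asking the master about the tablet's replicas.
bool WaitForVoterCount(
    const std::function<std::vector<ReplicaRole>()>& fetch_roles,
    std::size_t expected_voters,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds poll_interval = std::chrono::milliseconds(10)) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    const std::vector<ReplicaRole> roles = fetch_roles();
    std::size_t voters = 0;
    for (ReplicaRole role : roles) {
      if (role == ReplicaRole::kVoter) ++voters;
    }
    // Succeed only when no non-voter remains and the count matches.
    if (voters == expected_voters && roles.size() == expected_voters) {
      return true;
    }
    if (std::chrono::steady_clock::now() >= deadline) return false;
    std::this_thread::sleep_for(poll_interval);
  }
}
```

In a test, a call such as WaitForVoterCount(fetch, 3, timeout) would go
before the step that shuts down the other two tservers, so the remaining
replica is already a voter when verification starts.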
Change-Id: I132206371e2935f1e0f39e9eacad866fde22c5b8
---
M src/kudu/integration-tests/linked_list-test.cc
1 file changed, 31 insertions(+), 14 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/95/9795/3
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I132206371e2935f1e0f39e9eacad866fde22c5b8
Gerrit-Change-Number: 9795
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>