Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11918 )
Change subject: raft_consensus_nonvoter-itest: deflake a bit
......................................................................

raft_consensus_nonvoter-itest: deflake a bit

I saw a failure in ReplicaBehindWalGcThresholdITest.ReplicaReplacement
(GetParam() was (1, false)) just after the master was restarted:

  raft_consensus_nonvoter-itest.cc:2070: Failure
  Failed
  Bad status: Service unavailable: Leader not yet ready to serve requests

This is odd as there's a WaitForCatalogManager() call in there, so why
would a subsequent GetTabletLocations RPC return this ServiceUnavailable?

As best I can tell, the only way for this to happen is if the attempt to
grab the leadership lock from within the ListTables RPC (sent from
WaitForCatalogManager()) returns IllegalState, which it'll do if the UUID
in the master's cstate doesn't match the UUID on disk. Perhaps this can
happen during a leader master election; maybe the cstate's UUID becomes
empty for a little while? If that's true, this should fix the problem by
considering IllegalState to be a non-final state and continuing the loop.

I couldn't repro this failure, but Alexey managed to do so in a dist-test
loop with special latency injection enabled. Without the fix, 93 out of
256 runs failed, and with the fix, none failed.
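
The approach described above (treating IllegalState as "not ready yet" and continuing to poll) can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this change: Code, Status, IsTransient(), and WaitUntilReady() are hypothetical stand-ins rather than Kudu's classes, and treating ServiceUnavailable as transient alongside IllegalState is an assumption made for the example.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Hypothetical, simplified status type for illustration (not Kudu's Status).
enum class Code { kOk, kServiceUnavailable, kIllegalState, kNetworkError, kTimedOut };

struct Status {
  Code code;
  std::string msg;
  bool ok() const { return code == Code::kOk; }
};

// Codes the wait loop interprets as "not ready yet, try again".
// Assumption: ServiceUnavailable is retried too; the change itself only
// states that IllegalState becomes a non-final state.
bool IsTransient(Code c) {
  return c == Code::kServiceUnavailable || c == Code::kIllegalState;
}

// Calls `probe` until it succeeds, returns a non-transient error, or
// `max_attempts` calls have been made. Sleeps `delay` between attempts.
Status WaitUntilReady(const std::function<Status()>& probe,
                      int max_attempts,
                      std::chrono::milliseconds delay) {
  Status last{Code::kTimedOut, "no attempts made"};
  for (int i = 0; i < max_attempts; ++i) {
    last = probe();
    if (last.ok() || !IsTransient(last.code)) {
      return last;  // Success or a final error: stop polling.
    }
    std::this_thread::sleep_for(delay);  // Transient: keep looping.
  }
  return Status{Code::kTimedOut, "still not ready: " + last.msg};
}
```

In this sketch, a probe that first reports IllegalState and later succeeds causes WaitUntilReady() to return OK, whereas a probe returning any other error ends the wait on the first call.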

Change-Id: I8192bd669e7e309943ea82718dd715238d520bbd
Reviewed-on: http://gerrit.cloudera.org:8080/11918
Reviewed-by: Alexey Serbin <aser...@cloudera.com>
Tested-by: Kudu Jenkins
---
M src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
3 files changed, 21 insertions(+), 8 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8192bd669e7e309943ea82718dd715238d520bbd
Gerrit-Change-Number: 11918
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)