Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11918 )

Change subject: raft_consensus_nonvoter-itest: deflake a bit
......................................................................

raft_consensus_nonvoter-itest: deflake a bit

I saw a failure in ReplicaBehindWalGcThresholdITest.ReplicaReplacement
(GetParam() was (1, false)) just after the master was restarted:

  raft_consensus_nonvoter-itest.cc:2070: Failure
  Failed
  Bad status: Service unavailable: Leader not yet ready to serve requests

This is odd as there's a WaitForCatalogManager() call in there, so why would
a subsequent GetTabletLocations RPC return this ServiceUnavailable? As best
I can tell, the only way for this to happen is if the attempt to grab the
leadership lock from within the ListTables RPC (sent from
WaitForCatalogManager()) returns IllegalState, which it'll do if the
UUID in the master's cstate doesn't match the UUID on disk. Perhaps this can
happen during a leader master election; maybe the cstate's UUID becomes
empty for a little while? If that's true, this should fix the problem by
considering IllegalState to be a non-final state and continuing the loop.

I couldn't repro this failure, but Alexey managed to do so in a dist-test
loop with special latency injection enabled. Without the fix, 93 out of 256
runs failed, and with the fix, none failed.

Change-Id: I8192bd669e7e309943ea82718dd715238d520bbd
Reviewed-on: http://gerrit.cloudera.org:8080/11918
Reviewed-by: Alexey Serbin <aser...@cloudera.com>
Tested-by: Kudu Jenkins
---
M src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
3 files changed, 21 insertions(+), 8 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8192bd669e7e309943ea82718dd715238d520bbd
Gerrit-Change-Number: 11918
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

Reply via email to