Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11918

to look at the new patch set (#2).

Change subject: raft_consensus_nonvoter-itest: deflake a bit
......................................................................

raft_consensus_nonvoter-itest: deflake a bit

I saw a failure in ReplicaBehindWalGcThresholdITest.ReplicaReplacement
(GetParam() was (1, false)) just after the master was restarted:

  raft_consensus_nonvoter-itest.cc:2070: Failure
  Failed
  Bad status: Service unavailable: Leader not yet ready to serve requests

This is odd: there's a WaitForCatalogManager() call in there, so why would
a subsequent GetTabletLocations RPC return ServiceUnavailable? As best
I can tell, the only way for this to happen is if the attempt to grab the
leadership lock from within the ListTables RPC (sent from
WaitForCatalogManager()) returns IllegalState, which it'll do if the
UUID in the master's cstate doesn't match the UUID on disk. Perhaps this can
happen during a leader master election; maybe the cstate's UUID becomes
empty for a little while? If that's true, this change should fix the problem
by treating IllegalState as a non-final status and continuing the loop.
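
As a rough illustration of the idea (a sketch only, not the actual patch):
the wait loop keeps polling while the probe reports either
ServiceUnavailable or IllegalState, and stops on anything else. The Code
enum, WaitUntilReady(), and the probe callback are hypothetical stand-ins
invented for this example; they are not Kudu's Status class or the
ExternalMiniCluster API.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical status codes for illustration; not Kudu's Status class.
enum class Code { kOk, kServiceUnavailable, kIllegalState, kOther };

// Calls `probe` up to `max_attempts` times, sleeping `delay` between tries.
// ServiceUnavailable and IllegalState are both treated as transient, so the
// loop continues; any other code (including kOk) is returned immediately.
// If every attempt is transient, the last transient code is returned.
Code WaitUntilReady(const std::function<Code()>& probe,
                    int max_attempts,
                    std::chrono::milliseconds delay) {
  Code last = Code::kOther;
  for (int i = 0; i < max_attempts; ++i) {
    last = probe();
    const bool transient = last == Code::kServiceUnavailable ||
                           last == Code::kIllegalState;
    if (!transient) {
      return last;
    }
    std::this_thread::sleep_for(delay);
  }
  return last;
}
```

With a probe that reports IllegalState for a short while and then succeeds,
the loop rides out the transient window instead of bailing on the first
IllegalState.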

I couldn't repro this failure, but Alexey managed to do so in a dist-test
loop with special latency injection enabled. Without the fix, 93 out of 256
runs failed, and with the fix, none failed.

Change-Id: I8192bd669e7e309943ea82718dd715238d520bbd
---
M src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
3 files changed, 21 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/11918/2
--
To view, visit http://gerrit.cloudera.org:8080/11918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8192bd669e7e309943ea82718dd715238d520bbd
Gerrit-Change-Number: 11918
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
