Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/11918 )

Change subject: raft_consensus_nonvoter-itest: deflake a bit
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11918/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11918/1//COMMIT_MSG@21
PS1, Line 21: master's cstate doesn't match the UUID on disk
As it turned out, the reason was that the master's cstate had no leader, i.e. (cstate.leader_uuid() != uuid) would yield true because it was comparing the UUID of the system tablet replica with an empty string.


http://gerrit.cloudera.org:8080/#/c/11918/1//COMMIT_MSG@22
PS1, Line 22: maybe the cstate's UUID becomes
            : empty for a little while
> I took a look at the test logs.  As far as I can see, that was ServiceUnava
Additional information: since I was not able to reproduce the initial issue in over 1K runs, I injected random latency there:

  https://gerrit.cloudera.org/#/c/11931/

93 out of 256 runs failed, 3 of them with the exact ListTablets error message seen in the original flaky run:

  http://dist-test.cloudera.org//job?job_id=aserbin.1542234637.28507

After I applied the patch, there was not a single failure in 256 runs:

  http://dist-test.cloudera.org//job?job_id=aserbin.1542235694.36306



-- 
To view, visit http://gerrit.cloudera.org:8080/11918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8192bd669e7e309943ea82718dd715238d520bbd
Gerrit-Change-Number: 11918
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 14 Nov 2018 23:05:46 +0000
Gerrit-HasComments: Yes
