Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10887 )
Change subject: [raft_conensus-itest] fix TestElectionMetrics flake ...................................................................... Patch Set 1: (14 comments) http://gerrit.cloudera.org:8080/#/c/10887/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10887/1//COMMIT_MSG@7 PS1, Line 7: conensus consensus http://gerrit.cloudera.org:8080/#/c/10887/1//COMMIT_MSG@10 PS1, Line 10: The original TestElectionMetrics scenario was split into : two parts TestElectionMetricsPart[1,2] Why? http://gerrit.cloudera.org:8080/#/c/10887/1//COMMIT_MSG@14 PS1, Line 14: happen happened http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/consensus/raft_consensus.cc File src/kudu/consensus/raft_consensus.cc: http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/consensus/raft_consensus.cc@1321 PS1, Line 1321: Resetting Reset http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/consensus/raft_consensus.cc@1321 PS1, Line 1321: since : // that's the now that we've accepted an http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/consensus/raft_consensus.cc@1323 PS1, Line 1323: it might be : // a race between setting the leader UUID in the consensus meta and : // VOTE_DENIED response sent for a vote request sent earlier. If the : // VOTE_DENIED response processed _after_ the call to UpdateReplica(), : // the above-mentioned 'failed_elections_since_stable_leader' metric would : // stuck with non-zero value if not resetting it here as well. I'd rephrase this a bit to "because there is a potential race between resetting the failed elections count in SetLeaderUuidUnlocked() and incrementing after a failed election if another replica was elected leader in an election concurrent with the one called by this replica." http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/integration-tests/raft_consensus-itest.cc File src/kudu/integration-tests/raft_consensus-itest.cc: http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/integration-tests/raft_consensus-itest.cc@2673 PS1, Line 2673: the failed leader detection leader failure detection http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/integration-tests/raft_consensus-itest.cc@2674 PS1, Line 2674: that's to avoid this avoids http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/integration-tests/raft_consensus-itest.cc@2674 PS1, Line 2674: be racing race http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/integration-tests/raft_consensus-itest.cc@2675 PS1, Line 2675: assertion of appropriate constraints on tests of http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/integration-tests/raft_consensus-itest.cc@2676 PS1, Line 2676: Part1 TestElectionMetrics_FailureDetectionDisabled perhaps? http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/integration-tests/raft_consensus-itest.cc@2745 PS1, Line 2745: the failed leader detection leader failure detection http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/integration-tests/raft_consensus-itest.cc@2746 PS1, Line 2746: Part2 TestElectionMetrics_FailureDetectionEnabled perhaps? http://gerrit.cloudera.org:8080/#/c/10887/1/src/kudu/integration-tests/raft_consensus-itest.cc@2796 PS1, Line 2796: Start back Restart -- To view, visit http://gerrit.cloudera.org:8080/10887 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I073c9989a6d5d5dc1eb104120a89d38cfce2ac6e Gerrit-Change-Number: 10887 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Comment-Date: Fri, 06 Jul 2018 23:31:48 +0000 Gerrit-HasComments: Yes
