Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14260 )
Change subject: [consensus] KUDU-2947 fix voting in case of slow WAL ...................................................................... Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/14260/1/src/kudu/consensus/raft_consensus.cc File src/kudu/consensus/raft_consensus.cc: http://gerrit.cloudera.org:8080/#/c/14260/1/src/kudu/consensus/raft_consensus.cc@1437 PS1, Line 1437: : // Also prohibit voting for anyone for the minimum election timeout. : withhold_votes_until_ = MonoTime::Now() + MinimumElectionTimeout(); > Could you decompose this pattern into a function with a descriptive name? Done http://gerrit.cloudera.org:8080/#/c/14260/1/src/kudu/integration-tests/raft_consensus_election-itest.cc File src/kudu/integration-tests/raft_consensus_election-itest.cc: http://gerrit.cloudera.org:8080/#/c/14260/1/src/kudu/integration-tests/raft_consensus_election-itest.cc@581 PS1, Line 581: const auto kMaxMissedHeartbeatPeriods = 3; > This is the default value. Yes, but I need it to use it as a variable in computations below. I thought it's easier to consistently set it to the de facto default value than extracting the default value from tserver/masters FLAGS_raft_hearbeat_interval_ms. I'm open to other ideas on how to refer to that value here in the test. http://gerrit.cloudera.org:8080/#/c/14260/1/src/kudu/integration-tests/raft_consensus_election-itest.cc@595 PS1, Line 595: ASSERT_OK(GetLeaderReplicaWithRetries(tablet_id_, &leader)); > Leadership can fluctuate after this call completes. How will the test respo That's a good point. In this test after one of the tablet servers is shutdown, the leadership should _not_ fluctuate because: 1) only two voters are alive 2) current leader should not be disturbed by rogue votes The leadership can fluctuate only if loosing Raft heartbeats, but I assume on a loopback interface that should not be the case. I updated the code and added corresponding comment. http://gerrit.cloudera.org:8080/#/c/14260/1/src/kudu/integration-tests/raft_consensus_election-itest.cc@614 PS1, Line 614: if (leader_tserver_idx == idx) { : ASSERT_OK(cluster_->SetFlag(ts, "log_inject_latency_ms_mean", : std::to_string(2 * kHeartbeatIntervalMs * kMaxMissedHeartbeatPeriods))); : ASSERT_OK(cluster_->SetFlag(ts, "log_inject_latency_ms_stddev", "0")); : ASSERT_OK(cluster_->SetFlag(ts, "log_inject_latency", "true")); : } else { : ASSERT_OK(cluster_->SetFlag(ts, "log_inject_latency_ms_mean", : std::to_string(kHeartbeatIntervalMs * kMaxMissedHeartbeatPeriods))); : ASSERT_OK(cluster_->SetFlag(ts, "log_inject_latency_ms_stddev", "0")); : ASSERT_OK(cluster_->SetFlag(ts, "log_inject_latency", "true")); : } > This can be condensed somewhat: Done http://gerrit.cloudera.org:8080/#/c/14260/1/src/kudu/integration-tests/raft_consensus_election-itest.cc@664 PS1, Line 664: ASSERT_FALSE(s.ok()) << "vote unexpectedly granted while leader is alive"; > Can we assert on a particular status code? Done -- To view, visit http://gerrit.cloudera.org:8080/14260 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c061b498e727a1a11e94e03c55530eeebfdf8dd Gerrit-Change-Number: 14260 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Volodymyr Verovkin <[email protected]> Gerrit-Comment-Date: Thu, 19 Sep 2019 23:07:13 +0000 Gerrit-HasComments: Yes
