Adar Dembo has posted comments on this change.

Change subject: KUDU-2149: avoid election stacking by restoring failure monitor semantics
......................................................................

Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8107/3/src/kudu/consensus/raft_consensus.h
File src/kudu/consensus/raft_consensus.h:

PS3, Line 786: Note: the lock is only ever acquired via try_lock(); if it cannot be
             :   // acquired, an election must be in progress so the next one is skipped.
> Yes, the new version of updated comment in PS4 is clearer, thanks.

Yes. I think the cleanest way to do that is to modify the failure detector to be "one-shot", reenabling it when an election finishes. But I'm not sure of that approach yet because failure detection is only one of several paths into StartElection(), so I went with something simpler here.


http://gerrit.cloudera.org:8080/#/c/8107/4/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

Line 3244:     // Drive up the number of elections by reducing the failure period.
> If it makes sense for this scenario, consider adding --raft_enable_pre_elec

Disabling pre-elections also cuts the number of elections (and thus votes) down by more than half, so I'd prefer to keep them enabled.


Line 3245:       "--raft_heartbeat_interval_ms=100"
> Would it make sense to decrease the default --leader_failure_max_missed_hea

I don't think it's strictly necessary. 300 ms to detect a failure is reasonable, and I'd like to avoid unnecessary volatility due to slow/missed heartbeats.


--
To view, visit http://gerrit.cloudera.org:8080/8107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifeaf99ce57f7d5cd01a6c786c178567a98438ced
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes
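
For readers following the try_lock() discussion on line 786, here is a minimal sketch of the skip-if-busy idea. It is not the code from the patch: ElectionGuard and MaybeRunElection are hypothetical names, and the sketch assumes the election runs synchronously while the lock is held, which may not match how raft_consensus.cc actually holds the lock.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical illustration only; these names do not come from
// raft_consensus.h. The lock is never acquired with a blocking lock(),
// only with try_lock semantics.
class ElectionGuard {
 public:
  // Runs 'run_election' if no other election currently holds the lock.
  // Returns true if the election ran, false if it was skipped.
  template <typename Fn>
  bool MaybeRunElection(Fn&& run_election) {
    std::unique_lock<std::mutex> l(lock_, std::try_to_lock);
    if (!l.owns_lock()) {
      // Could not acquire the lock: assume an election is already in
      // progress and skip this one rather than stacking another behind it.
      skipped_++;
      return false;
    }
    std::forward<Fn>(run_election)();
    started_++;
    return true;  // 'l' releases the lock on scope exit.
  }

  int started() const { return started_; }
  int skipped() const { return skipped_; }

 private:
  std::mutex lock_;
  std::atomic<int> started_{0};
  std::atomic<int> skipped_{0};
};
```

A second caller that arrives while an election is running returns immediately instead of queueing, so elections cannot pile up. Note that std::mutex::try_lock() is allowed to fail spuriously; in this sketch that would only cause an extra skipped election.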