[
https://issues.apache.org/jira/browse/KAFKA-18345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
黃竣陽 reassigned KAFKA-18345:
---------------------------
Assignee: 黃竣陽
> Investigate if binaryExponentialElectionBackoffMs can be removed or improved
> ----------------------------------------------------------------------------
>
> Key: KAFKA-18345
> URL: https://issues.apache.org/jira/browse/KAFKA-18345
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Alyssa Huang
> Assignee: 黃竣陽
> Priority: Minor
>
> Right now this exponential backoff method can return a min backoff of 200ms
> and cap out at the electionBackoffMaxMs (1s default). The logic for
> calculating this exponential backoff isn't very straightforward, and probably
> no longer needed now that Candidate transitions to Prospective after election
> loss. We can simplify states by not tracking "retries" as well.
> We _do_ still need a bit of randomness to ensure we can get past gridlocked
> elections, but as Diego's Raft thesis shows, 50ms of random range is
> sufficient to avoid repeat split votes in elections.
> > Using more randomness improves worst-case behavior: with a 50 ms random
> > range, the worst-case completion time (over 1,000 trials) was 513 ms.
> We should investigate if it would make sense to replace the exponential
> backoff with a non-exponential backoff that is more straightforward.
> e.g.
> {code:java}
> int randomElectionBackoffMs() {
> long backoff = random(0, 50ms)
> return Math.min(state.remainingElectionTimeoutMs, backoff)
> }{code}
> Note, it's fine that the max is set to state.remainingElectionTimeoutMs
> because that value is initialized with randomness in it (up to another
> quorumConfig.electionTimeoutMs).
> The alternative could also be using a lower default electionTimeoutMs (like
> 300ms mentioned in the Raft paper) and on election failure, waiting the
> extent of state.remainingElectionTimeoutMs (which contains some jitter).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)