[ https://issues.apache.org/jira/browse/KAFKA-18345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
José Armando García Sancio updated KAFKA-18345: ----------------------------------------------- Priority: Critical (was: Minor) > Investigate if binaryExponentialElectionBackoffMs can be removed or improved > ---------------------------------------------------------------------------- > > Key: KAFKA-18345 > URL: https://issues.apache.org/jira/browse/KAFKA-18345 > Project: Kafka > Issue Type: Bug > Components: kraft > Reporter: Alyssa Huang > Assignee: 黃竣陽 > Priority: Critical > > Right now this exponential backoff method can return a min backoff of 200ms > and cap out at the electionBackoffMaxMs (1s default). The logic for > calculating this exponential backoff isn't very straightforward, and probably > no longer needed now that Candidate transitions to Prospective after election > loss. We can simplify states by not tracking "retries" as well. > We _do_ still need a bit of randomness to ensure we can get past gridlocked > elections, but as Diego's Raft thesis shows, 50ms of random range is > sufficient to avoid repeat split votes in elections. > > Using more randomness improves worst-case behavior: with a 50 ms random > > range, the worst-case completion time (over 1,000 trials) was 513 ms. > We should investigate if it would make sense to replace the exponential > backoff with a non-exponential backoff that is more straightforward. > e.g. > {code:java} > int randomElectionBackoffMs() { > long backoff = random(0, 50ms) > return Math.min(state.remainingElectionTimeoutMs, backoff) > }{code} > Note, it's fine that the max is set to state.remainingElectionTimeoutMs > because that value is initialized with randomness in it (up to another > quorumConfig.electionTimeoutMs). > The alternative could also be using a lower default electionTimeoutMs (like > 300ms mentioned in the Raft paper) and on election failure, waiting the > extent of state.remainingElectionTimeoutMs (which contains some jitter). -- This message was sent by Atlassian Jira (v8.20.10#820010)