[
https://issues.apache.org/jira/browse/RATIS-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tsz-wo Sze resolved RATIS-1265.
-------------------------------
Resolution: Not A Problem
After RATIS-1247, this no longer seems to be a problem. Please feel free to
reopen if necessary.
> Fix leader election with priority too slow
> ------------------------------------------
>
> Key: RATIS-1265
> URL: https://issues.apache.org/jira/browse/RATIS-1265
> Project: Ratis
> Issue Type: Sub-task
> Reporter: runzhiwang
> Assignee: runzhiwang
> Priority: Major
> Attachments: leader_election_slow
>
>
> As the attached log shows, there are 3 servers: s0, s1, and s2, with s2 as
> the leader. We then give s0 the highest priority, so s2 will call
> yieldLeaderToHigherPriorityPeer(s0) once s0's log catches up. In
> yieldLeaderToHigherPriorityPeer, s2 steps down.
> But when s2 steps down, which server requests votes first is almost random.
> If s0 cannot request votes within a short time, the leader election lasts a
> long time.
> As the attached log shows, elections happened 8 times over 14 seconds, but
> s0 only tried to start a leader election on the 6th attempt, and even then
> could not get the leadership.
> {code:java}
> 2020-12-25 10:11:34,995 s1: start s1@group-241716F733F8-LeaderElection2
> fail because s0 reject
> 2020-12-25 10:11:37,228 s2: start s2@group-241716F733F8-LeaderElection3
> fail because s0 reject
> 2020-12-25 10:11:39,345 s1: start s1@group-241716F733F8-LeaderElection4
> fail because s0 reject
> 2020-12-25 10:11:41,600 s1: start s1@group-241716F733F8-LeaderElection5
> fail because s0 reject
> 2020-12-25 10:11:43,710 s2: start s2@group-241716F733F8-LeaderElection6
> fail because s0 reject
> 2020-12-25 10:11:46,248 s0: start s0@group-241716F733F8-LeaderElection7
> fail because s1 started an election 200ms later and s1's vote request reached
> s2 before s0's, so s1 voted for itself and rejected s0 at 2020-12-25
> 10:11:47,267, and s2 voted for s1 at 2020-12-25 10:11:46,469 and rejected s0
> at 2020-12-25 10:11:47,267
> 2020-12-25 10:11:46,461 s1: start s1@group-241716F733F8-LeaderElection8
> fail because s0 reject
> 2020-12-25 10:11:48,597 s2: start s2@group-241716F733F8-LeaderElection9
> fail because s0 reject
> {code}
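The race described in the issue can be illustrated with a small toy model. This is only a sketch and not Ratis code: the class name, method names, and the 150-300 ms timeout range are assumptions made up for illustration. It assumes every peer draws a uniform random election timeout after the leader steps down, and that peer 0 (standing in for s0) rejects every lower-priority candidate, so a round can only succeed when peer 0 times out first. It ignores the case seen in LeaderElection7, where s0 does start first and still loses.

```java
import java.util.Random;

/** Toy model of the election race in this issue; hypothetical, not Ratis code. */
class PriorityElectionSketch {
  // Assumed timeout range in milliseconds; illustrative values only.
  static final int MIN_TIMEOUT_MS = 150;
  static final int MAX_TIMEOUT_MS = 300;

  /** Returns the index of the peer whose random timeout fires first (ties go to the lowest index). */
  static int firstToTimeOut(int peers, Random random) {
    int first = 0;
    int best = Integer.MAX_VALUE;
    for (int i = 0; i < peers; i++) {
      int timeout = MIN_TIMEOUT_MS + random.nextInt(MAX_TIMEOUT_MS - MIN_TIMEOUT_MS + 1);
      if (timeout < best) {
        best = timeout;
        first = i;
      }
    }
    return first;
  }

  /**
   * Counts election rounds until peer 0 (the highest-priority peer) is the first candidate.
   * Assumption: peer 0 rejects every other candidate, so all earlier rounds fail.
   */
  static int roundsUntilPriorityPeerStarts(int peers, Random random) {
    int rounds = 1;
    while (firstToTimeOut(peers, random) != 0) {
      rounds++;
    }
    return rounds;
  }

  /** Averages the round count over many simulated step-downs. */
  static double averageRounds(int peers, int trials, long seed) {
    Random random = new Random(seed);
    long total = 0;
    for (int t = 0; t < trials; t++) {
      total += roundsUntilPriorityPeerStarts(peers, random);
    }
    return (double) total / trials;
  }

  public static void main(String[] args) {
    System.out.printf("average rounds with 3 peers: %.2f%n", averageRounds(3, 100_000, 42L));
  }
}
```

Under these assumptions peer 0 is the first candidate in about one round out of three, so the average comes out at roughly 3 rounds, and long runs of failed rounds like the one in the log are not unusual. If each failed round costs about 2 seconds, as the timestamps above suggest, that alone adds several seconds to the election.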
--
This message was sent by Atlassian Jira
(v8.20.10#820010)