Hi Han:
Thanks for your kind reply! I have tried your patch, and the candidate
problem is fixed in my environment. My 3-node Raft cluster now works well.
Another question: I see that you have also submitted an ovsdb-cluster
testsuite. How can I run these tests on my own setup?
Thanks
Timo
------------------------------------------------------------------
On Sat, Mar 7, 2020 at 2:33 AM txfh2007 <[email protected]> wrote:
>
> Hi Han:
>
> Thanks for your reply! There is one point on which I can't agree with you:
> "If S2 or S3 already becomes leader, their term won't be lower than S2. " In
> my test, in step 3, S3 is the leader and its term is lower than S2's. The
> reason is that while S2 is disconnected from S1 and S3, S2 keeps
> incrementing its term and sending vote requests until its connection
> recovers. Meanwhile, S3 becomes leader and stops incrementing its term. So
> it is possible that S2's term is larger than S3's, and that's why in step 3
> S2 replies "stale term" to S3's append-entries request.
Hi Timo,
Sorry that my answer wasn't accurate enough and caused confusion. My answer
focused on the "candidate forever" scenario you reported, so I didn't take
the more common scenario (a reconnected server having a larger term) into
account, but of course the more common scenario does exist. Please see my
rephrased answer below, and let me know if it clears up the confusion.
Thanks,
Han
>
> Timo
>
>
> On Fri, Mar 6, 2020 at 1:13 AM txfh2007 via discuss
> <[email protected]> wrote:
> >
> > Hi Han && all:
> >
> > I have a question about RAFT: I have tried the latest OVN-2.30, and
> > have found that under some conditions there is one node whose role is
> > always "Candidate" (as reported by the cluster/status command) but which
> > acts as a Follower. My cluster still works well, but it seems odd that a
> > server's role is always Candidate. As far as I know, a server's role is
> > normally Follower or Leader.
>
>
> Hi Timo, I happened to fix the problem yesterday and here is the patch:
> https://patchwork.ozlabs.org/patch/1250116/. Details of my analysis are in
> the commit message, and a test case is added to cover this scenario.
>
>
>
> > After digging into the related code, I think I can describe how to
> > reproduce this scenario:
> > 1. It is a three-server cluster: one leader (S2) and two followers (S1, S3).
> > 2. Disconnect the leader (S2) from the other two servers, so S2
> > increments its term and sends vote requests, and meanwhile S1 and S3 elect
> > a new leader (let's say it's S3).
>
>
> When S1 and S3 choose a new leader, they (one of them, or both) would have to
> increase the term, too.
>
>
>
> > 3. Recover the connection between S2 and the other two nodes. Then, if S2
> > receives an append-entries request from S3, S2 will reply "stale term"
> > because S3's term is lower.
>
>
> If S2 or S3 already becomes leader, their term won't be lower than S2. From
> this point on, the below steps shouldn't happen. But instead, it is possible
> that when S2 receives append-request from the new leader, it has the same
> term, and it updates the leader without switching from candidate to follower,
> thus resulting in the candidate state forever.
>
Rephrase:
If S1 (not S2, sorry for the typo above) or S3 has already become leader, it is
possible that its term is the same as S2's when S2's connection is
restored. When S2 then receives an append-request from the new leader, because
it observes the same term, it updates the leader without switching from
candidate to follower (which is a bug in the implementation, fixed in the patch
I posted, which was merged yesterday), thus resulting in the candidate state
forever. In this situation, the candidate doesn't increase its term or initiate
vote-requests any more because it regularly receives append-requests
(heartbeats) and responds to them, like a follower. The only difference is that
it announces itself as "disconnected from cluster" to its clients, so all the
clients will be disconnected from it.
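
To illustrate the same-term case, here is a minimal Python sketch. It is only
a toy model, not the OVS code: the Server class, on_append_request() and the
"fixed" flag are hypothetical names made up for this example. It assumes the
fix amounts to a candidate stepping down to follower when it accepts an
append-request in its own term.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    # Toy model of a Raft server; not the OVS data structure.
    name: str
    term: int
    role: str = "follower"
    leader: Optional[str] = None


def on_append_request(server, leader_name, leader_term, fixed=True):
    """Handle an append-request (heartbeat) and return the reply reason.

    With fixed=False the same-term branch reproduces the behaviour described
    above: the leader is recorded but the role stays "candidate".
    """
    if leader_term < server.term:
        return "stale term"
    if leader_term > server.term:
        # A newer term always forces the receiver back to follower.
        server.term = leader_term
        server.role = "follower"
    elif fixed and server.role == "candidate":
        # Same term: a leader exists for this term, so stop being a candidate.
        server.role = "follower"
    server.leader = leader_name
    return "ok"


# S2 reconnects as a candidate in the same term as the new leader S3.
buggy = Server("S2", term=5, role="candidate")
on_append_request(buggy, "S3", 5, fixed=False)
print(buggy.role, buggy.leader)   # candidate S3

patched = Server("S2", term=5, role="candidate")
on_append_request(patched, "S3", 5, fixed=True)
print(patched.role, patched.leader)   # follower S3
```

In the unfixed branch the server keeps answering heartbeats, so its election
timer never fires and nothing ever moves it out of the candidate role.
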
On the other hand, if S2's connection is restored after more election timer
timeouts, its term can be larger than the new leader's. In this case, it won't
trigger the "candidate forever" problem. First, the candidate will send a
vote-request with a larger term, but the new leader will reject the
vote-request because it is the leader itself, and the follower will also
reject it because of the logic in "raft_should_suppress_disruptive_server()".
However, the candidate will receive an append-request from the new leader,
which has a smaller term. It replies with an append-reply with reason "stale
term" carrying its own term number. When the leader receives this reply, it
sees a larger term number than its own, so it updates its term to the larger
term and steps down to follower. The cluster then starts an election again,
which ends up with one leader and two followers as usual.
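
The larger-term exchange can be sketched the same way. Again this is a toy
Python model under my own assumptions, not the OVS implementation:
handle_append_request() and handle_append_reply() are hypothetical helper
names, and servers are plain dicts.

```python
def handle_append_request(server, leader_term):
    """Receiver side: reject an append-request that carries an older term."""
    if leader_term < server["term"]:
        # The reply carries the receiver's own (larger) term number.
        return {"reason": "stale term", "term": server["term"]}
    server["term"] = leader_term
    server["role"] = "follower"
    return {"reason": "ok", "term": server["term"]}


def handle_append_reply(leader, reply):
    """Leader side: a larger term in the reply forces a step-down."""
    if reply["term"] > leader["term"]:
        leader["term"] = reply["term"]
        leader["role"] = "follower"


# S2 came back with term 7 while the new leader S3 is still at term 5.
s2 = {"name": "S2", "term": 7, "role": "candidate"}
s3 = {"name": "S3", "term": 5, "role": "leader"}

reply = handle_append_request(s2, s3["term"])
handle_append_reply(s3, reply)
print(reply["reason"], s3["role"], s3["term"])   # stale term follower 7
```

After the step-down nobody is leader in term 7, so a fresh election follows.
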
>
>
> > 4. After S3 gets S2's reply, S3 will change its term to S2's value
> > and change its role to follower and then candidate (at this point,
> > S1/S2/S3 are all in the candidate role).
> > 5. Then if S2 gets S3's vote request and votes for S3, S3 will become
> > the new leader, but S2's role is still candidate.
If all 3 ended up as candidates in the same term, as mentioned in your step 4,
each of them votes only for itself, so no leader is elected in that term, and
they will have to increase the term (at random times) and hold the election
again. To my understanding, the only way to end up with a candidate forever is
when 2 servers become candidates competing in the *same term*.
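
The split-vote arithmetic is easy to check with a small Python sketch. The
elect() helper is hypothetical and only counts votes for a single term; it
assumes a simple majority quorum of the cluster size.

```python
from collections import Counter


def elect(votes, cluster_size):
    """Return the winner of one term, or None if nobody reaches a majority.

    votes maps each voter to the server it voted for in this term.
    """
    quorum = cluster_size // 2 + 1
    tally = Counter(votes.values())
    for candidate, count in tally.items():
        if count >= quorum:
            return candidate
    return None


# All three are candidates in the same term: each votes only for itself.
print(elect({"S1": "S1", "S2": "S2", "S3": "S3"}, 3))   # None

# Two candidates (S2, S3) compete in the same term and S1 votes for S3.
print(elect({"S1": "S3", "S2": "S2", "S3": "S3"}, 3))   # S3
```

In the second case S3 wins while S2 is still a candidate in that same term,
which is the situation where the same-term append-request handling matters.
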
> >
> > I guess the reason is that the term of S3's vote request is equal to S2's
> > term. S2 will change to follower only if it receives a vote request whose
> > term value is larger than its own.
> > Am I right? And is the candidate role (while actually being a follower)
> > reasonable?
> >
> > Thanks
> > Timo
> >