Hi Ben,

The issues are sporadic. I have the Prometheus ovn_exporter running to capture
the relevant Raft metrics. Once the issues reappear, I will post log entries
from around the time when the cluster's servers are in the Leader, Candidate,
and Follower states. (During normal operation, there is one Leader and two
Followers.)

Best Regards,
Paul Greenberg

________________________________
From: Ben Pfaff <[email protected]>
Sent: Monday, November 5, 2018 4:21 PM
To: Paul Greenberg
Cc: Yifeng Sun; ovs dev
Subject: Re: [ovs-dev] [PATCH] ovsdb: Clarify that a server that leaves a 
cluster may never rejoin.

On Fri, Nov 02, 2018 at 07:08:34PM +0000, Paul Greenberg wrote:
> Let me clarify. Based on my observation, once a server loses touch
> with the rest of the cluster you have to rejoin it.
> At the same time it is not readily apparent that you have to do the
> clean up (removal) yourself. For example, you had a cluster of 3
> nodes, then after some time one node is out. You go to that node and
> do a cluster join. My understanding is that a new server id gets
> generated.
>
> Now, if you did not do a cleanup, you end up with a 4-node cluster. When 3
> out of 4 are working, there is no issue. It is almost mandatory to do a
> cleanup after a join.

The design intent is that, if a server goes out of contact with the rest
of the cluster, and later comes back into contact, then it gracefully
catches up and becomes a productive member of the cluster.

It seems like you're encountering bugs that I don't understand yet. Can
you help me to reproduce them?
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev