The question is whether EVS detects nodes rejoining. For example, the "losing side" in the merge could receive EVS_ERR_BAD_HANDLE and be forced to reconnect. This is only possible if EVS maintains some form of history of previous ring formations. If EVS drops all information about the membership of earlier rings, then of course it cannot help in arbitrating the merge problem.
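
If EVS keeps no such history, the application could keep it itself. Below is a minimal sketch of that idea, not anything taken from the openais API: every name in it (MembershipHistory, classify_join, losing_side) is hypothetical. It assumes each node announces an "incarnation" number that changes whenever the node restarts, and it assumes a simple arbitration rule (the smaller partition loses; on a tie, the side without the lowest node id loses). Both are assumptions for illustration only.

```python
# Hypothetical application-side sketch; nothing here is part of the EVS library.

class MembershipHistory:
    """Remembers the incarnation each node had when it was last in our ring."""

    def __init__(self):
        self.last_seen = {}  # node id -> incarnation last observed

    def record(self, members):
        """Store the incarnation of every node in the current ring."""
        self.last_seen.update(members)

    def classify_join(self, node, incarnation):
        """Tell a first join, a restart, and a rejoin after a split apart."""
        previous = self.last_seen.get(node)
        self.last_seen[node] = incarnation
        if previous is None:
            return "first_join"          # never seen: plain sync is enough
        if previous == incarnation:
            return "rejoin_after_split"  # same process came back: merge needed
        return "restarted"               # node rebooted: plain sync is enough


def losing_side(side_a, side_b):
    """Assumed rule: the smaller partition loses; ties go against the
    side that does not contain the lowest node id."""
    a, b = sorted(side_a), sorted(side_b)
    if len(a) != len(b):
        return a if len(a) < len(b) else b
    return a if a[0] > b[0] else b


# Example with the six-node ring from the scenario quoted below.
history = MembershipHistory()
history.record({"n1": 1, "n2": 1, "n3": 1, "n4": 1, "n5": 1, "n6": 1})
kind = history.classify_join("n4", 1)  # n4 returns without having restarted
loser = losing_side({"n1", "n2", "n3"}, {"n4", "n5", "n6"})
```

The incarnation number is what separates a split-and-rejoin from a node that left and came back after a restart; without it the two cases look identical to the application.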

Arne

> -----Original Message-----
> From: Robert Wipfel [mailto:[EMAIL PROTECTED]
> Sent: 9 September 2008 14:12
> To: Arne Eriksson R; [EMAIL PROTECTED]
> Subject: Re: [Openais] Split brain when using EVS library
>
> >>> On 9/9/2008 at 4:27 AM, in message
> <[EMAIL PROTECTED] csson.se>, "Arne Eriksson R" <[EMAIL PROTECTED]> wrote:
> > Hi,
> > We have a cluster with 6 processors using openais stable version 0.80.3.
> >
> > For some reason our cluster splits up into two rings.
> > Scenario is:
> > node1(n1) n2 n3 n4 n5 n6 are in the ring.
> >
> > Suddenly the ring splits into two rings:
> > n1 n2 n3 got leave msg from n4 n5 n6
> > n4 n5 n6 got leave msg from n1 n2 n3
> >
> > After a few milliseconds the two rings join again:
> > n1 n2 n3 got join msg from n4 n5 n6
> > n4 n5 n6 got join msg from n1 n2 n3
> >
> > The two rings are joined into one ring again:
> > node1(n1) n2 n3 n4 n5 n6 are in the ring.
> >
> > The question is if this is a normal scenario from EVS in the openais
> > implementation?
> >
> > The problem is that the application needs to detect the difference
> > between two kinds of joins: the "normal" join, where the two rings/nodes
> > join for the first time, and the "abnormal" join, where a ring has split
> > and re-joined (without any nodes being restarted). The first case
> > typically requires only a sync of some nodes (bringing the history up to
> > date). The second case requires a merger, i.e. selection of a losing
> > side and the loser discarding its history.
>
> Sidebar: if assuming the presence of a shared disk someplace, then it
> can be used as a different kind of communication channel, for detecting
> Split Brain conditions:
> http://wiki.linux-ha.org/SBD_Fencing
> The idea is for the partitions to share membership information / detect
> that a partition exists. Just a thought - hopefully nothing bad happened
> while the partitions were split - in the second case ;-)
>
> Hth,
> Robert
>

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
