Re: [Openais] [whitetank / corosync trunk] Fix checkpoint sync in certain scenarios

Steven Dake Fri, 07 Nov 2008 13:47:41 -0800

It is the starting, not the exiting, that is at issue.

Specifically, when two nodes are synchronizing and a machine with a
lower IP address starts up, it triggers an abort in the synchronization
already in progress, and those nodes then completely fail to synchronize.

So I think it probably affects you if you use the checkpoint service.
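
To make the failure concrete, here is a minimal sketch of the decision
described in the quoted patch note below. It is only a model, not the
openais code: the function name, the integer ids standing in for IP
ordering, and the "sticky" flag are all assumptions made for illustration.

```python
# Hypothetical model of "who must sync", NOT the actual openais logic.
# Node ids stand in for IP address ordering: lower id = lower address.

def responsible_after(self_id, memberships, sticky):
    """Replay membership changes and report whether self_id still
    considers itself responsible for synchronizing its checkpoints."""
    responsible = False
    for members in memberships:
        if sticky and responsible:
            # Proposed rule: once responsible, an abort or configuration
            # change never revokes that responsibility.
            continue
        # Old rule (assumed): re-decide on every change; only the lowest
        # node id in the current membership syncs.
        responsible = (self_id == min(members))
    return responsible

A, B, C = 1, 2, 3
# B is alone, then C joins, then A joins while B is still resyncing.
history = [{B}, {B, C}, {A, B, C}]

print(responsible_after(B, history, sticky=False))  # False: B stops syncing
print(responsible_after(B, history, sticky=True))   # True: B keeps syncing
```

With sticky=False, node B drops its responsibility as soon as the lower
node A appears, which is the case where the checkpoints are lost.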

Regards
-steve

On Fri, 2008-11-07 at 14:48 +0100, Andrew Beekhof wrote:
> On Fri, Nov 7, 2008 at 10:24, Steven Dake <[EMAIL PROTECTED]> wrote:
> > In a certain rare scenario, the checkpoint service throws away the
> > current checkpoint database.
> >
> > An example of when this occurs: there are 3 nodes A, B, and C; nodes A
> > and C are killed
> 
> Does it have to be killed, or could shutdown trigger this too?
> 
> > then node B syncs.  After this completes, Node C is
> > started and node B again begins resyncing, but during this sync process
> > node A starts up.
> >
> > This results in node B no longer believing it is required to sync its
> > current database contents.  The abort called on node B throws away all
> > checkpoints in the system, but since node B is no longer the lowest node
> > id in the system, it believes it doesn't have to sync.
> >
> > The design change is that once a node has been declared responsible
> > for synchronization, no abort or configuration change will ever
> > revoke that responsibility.
> >
> > Regards
> > -steve
> >
> > _______________________________________________
> > Openais mailing list
> > [email protected]
> > https://lists.linux-foundation.org/mailman/listinfo/openais
> >
