Hi Steve,
It seems this fix is related to the potential bug I reported several weeks ago
in "[Openais] RE: Need help to reduce the time wait of saRecvRetry()". When a
node starts to open or read a checkpoint, it tries to resync the checkpoint,
which takes a long time. Is that right? If so, does this diff fix that bug? If
it does, I will rebuild the test environment to see whether the saRecvRetry()
delay that occurs when the physical connection is lost has gone away.
Thanks.
Best
Rat

> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Date: Fri, 7 Nov 2008 14:43:47 -0700
> CC: [EMAIL PROTECTED]
> Subject: Re: [Openais] [whitetank / corosync trunk] Fix checkpoint sync in certain scenarios
>
> it is the starting not the exiting that is at issue.
>
> specifically when two nodes are synchronizing and a lower IP addressed
> machine starts up, it triggers an abort in the other synchronization
> process and then those nodes completely fail to synchronize.
>
> So I think it probably affects you if you use the checkpoint service.
>
> Regards
> -steve
>
> On Fri, 2008-11-07 at 14:48 +0100, Andrew Beekhof wrote:
> > On Fri, Nov 7, 2008 at 10:24, Steven Dake <[EMAIL PROTECTED]> wrote:
> > > In a certain rare scenario, the checkpoint service throws away the
> > > current checkpoint database.
> > >
> > > An example of when this occurs is when there are 3 nodes A, B, C, node A
> > > and C are killed
> >
> > Does it have to be killed, or could shutdown trigger this too?
> >
> > > then node B syncs. After this completes, Node C is
> > > started and node B again begins resyncing, but during this sync process
> > > node A starts up.
> > >
> > > This results in node B no longer believing it is required to sync its
> > > current database contents. The abort called on node B throws away all
> > > checkpoints in the system but since node B is no longer the lowest node
> > > id in the system it believes it doesn't have to sync.
> > >
> > > The design change is that once a node has been declared as responsible
> > > for synchronization, any aborts or configuration changes will never
> > > change the fact that node is still responsible for synchronization.
> > >
> > > Regards
> > > -steve
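
To check my understanding of the design change described above, here is a
minimal sketch in Python. It is not taken from the openais source: SyncState,
on_config_change, on_sync_abort and the node ids 1, 2, 3 (for A, B, C) are all
hypothetical names chosen for illustration. The only rule it models is the one
stated in the thread: once a node has been declared responsible for
synchronization, later aborts or configuration changes do not clear that.

```python
from dataclasses import dataclass


@dataclass
class SyncState:
    """Hypothetical per-node state; not the real openais data structure."""
    node_id: int
    responsible: bool = False  # "sticky": never cleared once set


def on_config_change(state: SyncState, members: list[int]) -> None:
    """Re-evaluate responsibility after a membership change."""
    if state.responsible:
        # Already declared responsible: a new, lower node id must not
        # take the responsibility away.
        return
    # Assumption from the thread: the lowest node id is responsible.
    state.responsible = state.node_id == min(members)


def on_sync_abort(state: SyncState) -> None:
    """An abort interrupts the sync but leaves the responsibility flag alone."""
    # Intentionally does not touch state.responsible.


# Scenario from the thread, with A=1, B=2, C=3 (ids are assumptions).
node_b = SyncState(node_id=2)
on_config_change(node_b, [2])        # A and C are down, B syncs alone
on_config_change(node_b, [2, 3])     # C starts, B begins resyncing
on_config_change(node_b, [1, 2, 3])  # A starts during that sync
on_sync_abort(node_b)                # the sync in progress is aborted
print(node_b.responsible)            # B still has to sync its database
```

Without the early return in on_config_change, node B would drop its
responsibility as soon as A joined, which is the failure described above.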
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais