Hi,

On Tue, Mar 15, 2011 at 07:46:30AM -0700, Steven Dake wrote:
> On 03/14/2011 06:05 PM, AP wrote:
> > Hi,
> >
> > Just had severe network flakiness here and found corosync vanishing from
> > the process list on one of the nodes. Initially this was due to packet loss
> > but just now it was due to multicast not being enabled properly so that
> > the node in question could send multicast packets but not receive them.
> >
> > Attached is the corosync-fplay output as well as a bt full of the core
> > file. The OS is Debian squeeze (libc 2.11.2), kernel 2.6.37.2.
> >
> > AP
> >
> > _______________________________________________
> > Openais mailing list
> > [email protected]
> > https://lists.linux-foundation.org/mailman/listinfo/openais
>
> This bug is fixed in commit:

The assert looks like this one:

http://marc.info/?l=openais&m=129647667713161&w=2

Or is it that this patch fixes that one too?

Thanks,

Dejan

> commit 96fa74175b0efad6909bfff91f5948f4e8080768
> Author: Steven Dake <[email protected]>
> Date: Fri Mar 4 12:55:54 2011 -0700
>
>     Fix abort when token is lost in RECOVERY state
>
>     A commit token should be rejected when a token is lost in the recovery
>     state. This occurs naturally because the ring id increases by 4 for
>     every new ring. Prior to this patch, if the token was lost, the old
>     ring id information was restored, causing a commit token to be accepted
>     when it should be rejected. This erroneously accepted commit token would
>     lead to an assertion which is fixed by this patch.
>
>     Signed-off-by: Steven Dake <[email protected]>
>     Reviewed-by: Angus Salkeld <[email protected]>
