On 03/14/2011 06:05 PM, AP wrote:
> Hi,
> 
> Just had severe network flakiness here and found corosync vanishing from
> the process list on one of the nodes. Initially this was due to packet loss
> but just now it was due to multicast not being enabled properly so that
> the node in question could send multicast packets but not receive them.
> 
> Attached is the corosync-fplay output as well as a bt full of the core
> file. The OS is Debian squeeze (libc 2.11.2), kernel 2.6.37.2.
> 
> AP
> 
> 
> 
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais

This bug is fixed in commit:

commit 96fa74175b0efad6909bfff91f5948f4e8080768
Author: Steven Dake <[email protected]>
Date:   Fri Mar 4 12:55:54 2011 -0700

    Fix abort when token is lost in RECOVERY state

    A commit token should be rejected when a token is lost in the recovery
    state.  This occurs naturally because the ring id increases by 4 for
    every new ring.  Prior to this patch, if the token was lost, the old
    ring id information was restored, causing a commit token to be accepted
    when it should be rejected.  This erroneously accepted commit token would
    lead to an assertion which is fixed by this patch.

    Signed-off-by: Steven Dake <[email protected]>
    Reviewed-by: Angus Salkeld <[email protected]>
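
The ring id reasoning in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not corosync's actual code: the `Node` class, its method names, the state strings, and the rule that a commit token is accepted only when its ring sequence is strictly greater than the node's own are all hypothetical and chosen for illustration. The only detail taken from the commit message is that the ring id grows by 4 for each new ring.

```python
from dataclasses import dataclass

# From the commit message: the ring id increases by 4 for every new ring.
RING_SEQ_STEP = 4


@dataclass
class Node:
    """Toy model of one node's view of the ring id (hypothetical, not corosync code)."""

    ring_seq: int                # ring sequence the node currently tracks
    old_ring_seq: int = 0        # ring sequence saved before entering RECOVERY
    state: str = "OPERATIONAL"

    def enter_recovery(self, commit_token_seq: int) -> None:
        # Save the old ring id and adopt the one carried by the commit token.
        self.old_ring_seq = self.ring_seq
        self.ring_seq = commit_token_seq
        self.state = "RECOVERY"

    def token_lost(self, restore_old_ring_id: bool) -> None:
        # restore_old_ring_id=True models the pre-patch behaviour, where the
        # old ring id information was restored after the token was lost.
        if restore_old_ring_id:
            self.ring_seq = self.old_ring_seq
        self.state = "GATHER"

    def accepts_commit_token(self, commit_token_seq: int) -> bool:
        # Assumed rule: only a commit token for a strictly newer ring is accepted.
        return commit_token_seq > self.ring_seq


def stale_token_accepted(restore_old_ring_id: bool) -> bool:
    """Replay the scenario: enter RECOVERY, lose the token, see the same commit token again."""
    node = Node(ring_seq=8)
    stale_seq = node.ring_seq + RING_SEQ_STEP
    node.enter_recovery(stale_seq)
    node.token_lost(restore_old_ring_id)
    return node.accepts_commit_token(stale_seq)
```

In this model, `stale_token_accepted(True)` shows the stale commit token being accepted once the old ring id is restored, while `stale_token_accepted(False)` shows it being rejected because the node keeps the newer ring id. A commit token for the next ring, whose id is 4 higher again, would still be accepted.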