On 09/24/2010 05:55 PM, Lars Kellogg-Stedman wrote:
>> pacemaker is waiting for something in nanosleep.  Not sure what.
>
> Should I ping the pacemaker list separately?  I'm not sure how much
> overlap there is between here and there.
>
>>   The
> symptom you describe sounds like an inability for corosync to form a
>> membership because of switch-default STP settings.
>
> I had a brief a-ha! moment: these systems are KVM guests.  Network
> connectivity is through bridges on the Linux host, which default to a
> 30 second forwarding delay.  Tragically, we had already set this to
> zero:
>
>    # brctl showstp br613 | grep -i delay
>    forward delay             0.00                 bridge forward delay       0.00
>
> And in fact these systems use DHCP to acquire network settings, and if
> the issue was STP this would prevent them from receiving a lease from
> the DHCP server.
>
>> Try running the following on the node after a lockup:
>> killall -SEGV corosync
>> corosync-fplay
>> attach output
>
> I've attached the output to this message.

From the fplay records, it looks like corosync started up cleanly and
formed a membership with all nodes in the network (248, 249, 250).  I
would suggest pinging the pacemaker list for further investigation.
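
If anyone wants to rule out the forwarding delay on every bridge of the
KVM host rather than just br613, something like the following rough
sketch may help.  It assumes the host exposes bridge settings under
/sys/class/net/<bridge>/bridge/forward_delay in hundredths of a second;
the function name and the optional directory argument are my own
inventions for illustration, not part of brctl or corosync.

```shell
#!/bin/sh
# Print "<bridge> <forward delay in seconds>" for each Linux bridge.
# The optional first argument overrides the sysfs directory (assumed to
# be /sys/class/net on the host); it exists mainly to make testing easy.
bridge_forward_delays() {
    sysfs_net="${1:-/sys/class/net}"
    for dir in "$sysfs_net"/*/bridge; do
        # Skip non-bridges and the unexpanded glob when nothing matches.
        [ -r "$dir/forward_delay" ] || continue
        name=$(basename "$(dirname "$dir")")
        # Assumption: the value is reported in hundredths of a second.
        centis=$(cat "$dir/forward_delay")
        printf '%s %d.%02d\n' "$name" $((centis / 100)) $((centis % 100))
    done
}
```

Calling bridge_forward_delays with no argument on the host should list
one line per bridge; any bridge carrying cluster traffic that reports a
nonzero delay would be worth a second look.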

Regards
-steve
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
