> pacemaker is waiting for something in nanosleep. Not sure what.

Should I ping the pacemaker list separately? I'm not sure how much
overlap there is between here and there.

> The symptom you describe sounds like an inability for corosync to form a
> membership because of switch-default STP settings.

I had a brief a-ha! moment: these systems are KVM guests. Network
connectivity is through bridges on the Linux host, which default to a
30 second forwarding delay. Tragically, we had already set this to zero:

  # brctl showstp br613 | grep -i delay
  forward delay              0.00
  bridge forward delay       0.00

And in fact these systems use DHCP to acquire network settings; if the
issue were STP, they would not be able to obtain a lease from the DHCP
server.

> Try running the following on the node after a lockup:
> killall -SEGV corosync
> corosync-fplay
> attach output

I've attached the output to this message.
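
On the STP point above, here is a rough way to check the forward delay on every bridge of a host at once instead of one brctl call per bridge. It is only a sketch: it assumes the kernel exposes the setting at /sys/class/net/<bridge>/bridge/forward_delay, in hundredths of a second, and the function name and its optional root-directory argument are made up for illustration.

```shell
# Print "<bridge> <forward_delay>" for each Linux bridge found under a
# sysfs-style root. The root argument is a hypothetical override (handy for
# testing); with no argument the real /sys/class/net is used.
list_bridge_forward_delay() {
    root="${1:-/sys/class/net}"
    for f in "$root"/*/bridge/forward_delay; do
        # Skip the unexpanded glob (no bridges) and unreadable entries.
        [ -r "$f" ] || continue
        # The bridge name is the directory two levels above the file.
        br=$(basename "$(dirname "$(dirname "$f")")")
        # Assumed unit: hundredths of a second, so 0 means no delay.
        printf '%s %s\n' "$br" "$(cat "$f")"
    done
}
```

Calling `list_bridge_forward_delay` with no arguments on the KVM host should list br613 with a value of 0 if the setting shown by brctl is in effect.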
corosync.log
Description: Binary data
