Hi,

This issue might be related to Bug #1546:
http://developerbugs.linux-foundation.org//show_bug.cgi?id=1546

When Heartbeat recovers from a split brain, it tries to update its instance id. Some nodes first receive an old id that they had already received before. Almost all nodes then go on to receive the newest instance id successfully, but sometimes a few nodes do not.

I created a new Bugzilla entry and attached the logs:
http://developerbugs.linux-foundation.org//show_bug.cgi?id=1991

9 nodes received "instance=17" during a split brain. hac01, hac02, hac03, hac04, hac06, hac08 and hac09 received it again after the split brain was recovered. Most of them then received the newest id (like instance=18, 20, 23...) and joined the cluster membership. hac02 and hac06, however, received "instance=17" again and noticed the DC election, but then they froze: the newest id never arrived. The other nodes treat hac02 and hac06 as OFFLINE nodes.

This situation is very rare, so could it be a timing bug?

Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
