[Linux-HA] Regarding split brain

Preeti Jain Thu, 09 Dec 2010 19:36:36 -0800

Hello list,
 I am testing a network failure case by unplugging the NIC cable on one node,
and I am getting unwanted results: the whole cluster is disturbed and the
resource appears to move between different nodes until it stabilizes on one
node, which also results in failback.
For example, if I unplug the NIC cable from node 1, failover happens (it takes
some time to move to node 2), but when I plug the cable back in on node 1, a
kind of split brain happens and the resource takes some time to stabilize on
node 1. This results in a failback, which is not desired, as the resource
should stay on node 2.
Every node reports the other cluster nodes as returning after a partition.


Part of the log file on node 1 after plugging the NIC cable back in:
heartbeat[2521]: 2010/12/08_16:50:02 CRIT: Cluster node Node2 returning after partition.
heartbeat[2521]: 2010/12/08_16:50:02 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[2521]: 2010/12/08_16:50:02 WARN: Deadtime value may be too small.
heartbeat[2521]: 2010/12/08_16:50:02 info: See FAQ for information on tuning deadtime.
heartbeat[2521]: 2010/12/08_16:50:02 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[2521]: 2010/12/08_16:50:02 info: Link Node2:eth0 up.
heartbeat[2521]: 2010/12/08_16:50:02 WARN: Late heartbeat: Node Node2: interval 781870 ms
heartbeat[2521]: 2010/12/08_16:50:02 info: Status update for node Node2: status active
heartbeat[2521]: 2010/12/08_16:50:03 info: Link Node3:eth0 up.
heartbeat[2521]: 2010/12/08_16:50:03 info: Link Node4:eth0 up.
heartbeat[2521]: 2010/12/08_16:50:03 CRIT: Cluster node Node4 returning after partition.
heartbeat[2521]: 2010/12/08_16:50:03 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[2521]: 2010/12/08_16:50:03 WARN: Deadtime value may be too small.
heartbeat[2521]: 2010/12/08_16:50:03 info: See FAQ for information on tuning deadtime.
heartbeat[2521]: 2010/12/08_16:50:03 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[2521]: 2010/12/08_16:50:03 WARN: Late heartbeat: node Node4: interval 782200 ms
heartbeat[2521]: 2010/12/08_16:50:03 info: Status update for node Node4: status active
heartbeat[2521]: 2010/12/08_16:50:03 info: Link Node5:eth0 up.
heartbeat[2521]: 2010/12/08_16:50:04 CRIT: Cluster node Node2 returning after partition.
heartbeat[2521]: 2010/12/08_16:50:04 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[2521]: 2010/12/08_16:50:04 WARN: Deadtime value may be too small.
heartbeat[2521]: 2010/12/08_16:50:04 info: See FAQ for information on tuning deadtime.
heartbeat[2521]: 2010/12/08_16:50:04 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[2521]: 2010/12/08_16:50:04 WARN: Late heartbeat: node Node2: interval 784380 ms
heartbeat[2521]: 2010/12/08_16:50:04 info: Status update for node Node2: status active
heartbeat[2521]: 2010/12/08_16:50:04 CRIT: Cluster node Node5 returning after partition.
heartbeat[2521]: 2010/12/08_16:50:04 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[2521]: 2010/12/08_16:50:04 WARN: Deadtime value may be too small.
heartbeat[2521]: 2010/12/08_16:50:04 info: See FAQ for information on tuning deadtime.
heartbeat[2521]: 2010/12/08_16:50:04 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[2521]: 2010/12/08_16:50:04 WARN: Late heartbeat: node Node5: interval 784390 ms
heartbeat[2521]: 2010/12/08_16:50:04 info: Status update for node Node5: status active
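
To make the "Late heartbeat" warnings in the excerpt above easier to compare, here is a minimal Python sketch that pulls the node name and interval out of log text like this. It is not part of heartbeat; the function name late_heartbeats and the regular expression are my own assumptions, based only on the message format seen in this excerpt.

```python
import re

# Assumed message shape, taken from the excerpt above, e.g.
#   WARN: Late heartbeat: Node Node2: interval 781870 ms
# \s+ also matches a newline, so lines wrapped by a mail client still match.
LATE_RE = re.compile(
    r"WARN:\s+Late\s+heartbeat:\s+node\s+(?P<node>\S+):\s+interval\s+(?P<ms>\d+)\s+ms",
    re.IGNORECASE,
)


def late_heartbeats(log_text):
    """Return a list of (node, interval_ms) for each 'Late heartbeat' warning."""
    return [(m.group("node"), int(m.group("ms"))) for m in LATE_RE.finditer(log_text)]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python late_heartbeats.py ha-log
    with open(sys.argv[1]) as fh:
        for node, ms in late_heartbeats(fh.read()):
            print(f"{node}: {ms} ms ({ms / 60000:.1f} min)")
```

Run against the excerpt above, this should list four warnings (Node2, Node4, Node2, Node5) with intervals between 781870 ms and 784390 ms, that is, roughly 13 minutes each.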



Is there any solution for this problem?

Regards 
Preeti

