With heartbeat 2.1.3_3 and drbd 8.0.8 (dopd and STONITH ipmi in use), I was able to test my 2-node cluster simply by powering the nodes off and on in varying order, and the HA resources moved successfully in each case (hurray). Now I went back to the original test that frustrated me before: I yanked all the ethernet cables from the primary machine (both LAN and crossover).
On the secondary (unaffected) machine I see that STONITH tried to shoot the other node for about 20 minutes before giving up. Right now my secondary node says Secondary/Unknown and the primary node says Primary/Unknown. First off, is there a configurable parameter for how long STONITH keeps trying?

When I plugged the network back into the primary, it immediately rebooted (not sure why), and when it came back up I was in split brain again.

So whenever you have 2 nodes in a cluster and all redundant communication paths have been severed, then by default you will have a split brain that needs to be manually corrected. Am I understanding this right? I am not complaining; I am just trying to determine what to expect so I can write up procedures and whatnot. The failover worked great with the other tests.

Regards,
Doug

--
What profits a man if he gains the whole world yet loses his soul?
