heartbeat 2.1.3_3 and drbd 8.0.8 (dopd and STONITH ipmi in use)

I was able to test my 2-node cluster simply by powering the nodes
off and on in varying order, and the HA resources moved successfully
in each case (hurray).
Now I went back to my original test of previous frustration.  I yanked
all the ethernet cables from the primary machine (both LAN and
crossover).

On the Secondary (unaffected) machine I see that STONITH tried to
shoot the other node for about 20 minutes before giving up.  Right now
my Secondary node says Secondary/Unknown and the Primary node says
Primary/Unknown.

First off, is there a configurable parameter for how long STONITH keeps trying?

When I plugged the network back in, the Primary immediately rebooted
(not sure why), and when it came back up I was in split brain again.

So whenever you have 2 nodes in a cluster and all redundant
communication paths have been severed, then by default you will have a
Split Brain that needs to be manually corrected.  Am I understanding
this right?

I am not complaining; I am just trying to determine what to expect
so I can write up procedures and whatnot.  The failover worked great
with the other tests.

regards,

Doug



-- 
What profits a man if he gains the whole world yet loses his soul?
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
