On Mon, 2010-09-27 at 12:16 -0700, Robinson, Eric wrote:
> Not sure if you noticed in my previous message that I did physically
> power down the primary but the standby refused to take any action. 

Yes, I did notice that. My point is that I have seen on my clusters
that simply powering the primary down (i.e. having it suddenly go away)
may not be enough. For the standby to act, it would have to assume that
the primary is really gone, and that this is not just a cable or NIC
failure. STONITH is a method of *assuring* that the other node has gone
away. It is designed to prevent both nodes from trying to run the same
resources, which can have disastrous consequences.

As I noted, I am not certain whether STONITH is strictly required now,
but I have observed the same symptoms as you, and I ended up having to
configure STONITH to get failovers to work properly.

Usually though, if I explicitly set one node to standby, the other one
will take over, because the two nodes can still exchange messages, and
those messages convince the remaining node that the standby node will
not be running any resources.
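
To make that reasoning concrete, here is a toy Python sketch of the
decision as I understand it. It is not Heartbeat or Pacemaker code, and
the names PeerState and may_take_over are made up for illustration. The
only point is that a node which has merely gone silent is not treated
the same as one that is known to be out of the way.

```python
from enum import Enum


class PeerState(Enum):
    # Hypothetical states, chosen only to illustrate the argument.
    ONLINE = "online"            # peer is answering cluster messages
    STANDBY = "standby"          # peer announced it will run no resources
    UNREACHABLE = "unreachable"  # peer went silent: power loss, cable, or NIC?
    FENCED = "fenced"            # a STONITH device confirmed the peer is off


def may_take_over(peer: PeerState) -> bool:
    """Return True only when the peer is known not to be running resources."""
    return peer in (PeerState.STANDBY, PeerState.FENCED)


if __name__ == "__main__":
    for state in PeerState:
        print(f"{state.value:12} -> take over: {may_take_over(state)}")
```

In this model, pulling the power on the primary leaves the standby
looking at UNREACHABLE, which is why it may sit there and do nothing
until something like STONITH turns that into a confirmed state.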

So I really don't know if STONITH is your problem or would fix your
problem. I only note that I have seen the same symptoms and that was how
I fixed it for my clusters.

--Greg



_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
