Hi,

On Mon, Dec 08, 2008 at 10:02:49AM +0100, Adrian Schoene wrote:
> Hi there,
>
> I have a SLES 10 SP2 based two-node cluster. The cluster is stonith
> enabled and uses IPMI to kill a dead node.
>
> I am now testing how the cluster behaves when a node fails.
> I used iptables to block the UDP packets of a node. After a short time
> the node gets stonithed and the surviving node takes over the resources
> of the dead node.
>
> I tested the same thing by unplugging the power cables, with success.
> In my last test I forgot to plug the power cable back in, and the
> failover failed because the surviving node tries to reset / kill the
> dead node.
>
> stonithd[6810]: 2008/12/08_09:28:08 info: external_run_cmd: Calling
> '/usr/lib64/stonith/plugins/external/ipmi off bdmz02' returned 256
> stonithd[6810]: 2008/12/08_09:28:08 CRIT: external_reset_req: 'ipmi off'
> for host bdmz02 failed with rc 256
> stonithd[7151]: 2008/12/08_09:28:08 info: Failed to STONITH node bdmz02
> with one local device, exitcode = 5. Will try to use the next local
> device.
> stonithd[7151]: 2008/12/08_09:28:29 ERROR: Failed to STONITH the node
> bdmz02: optype=POWEROFF, op_result=TIMEOUT
>
> After plugging in the cable (but not starting the server) the server
> recognizes that the stonith of the server is back to life
> and the cluster starts the failover.
>
> How can I manage or solve this problem? It can happen that one
> server room loses its power unit, and then the server has no power.

You can't solve it. No power, no stonith device, no fencing, and
the cluster will wait forever. Either get a UPS based fencing
device, or make sure that there's power (if you can), or document
a procedure for manual failover on power loss.

Thanks,

Dejan

> Greetings,
> Adrian
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
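
To illustrate why the cluster blocks in this situation, here is a minimal
sketch of the fencing rule described in the reply: resources move only once
some device has confirmed that the peer is off. This is not stonithd code.
Every name in it (FenceDevice, fence, may_take_over, ipmi_without_power,
ups_outlet_off) is hypothetical, and the two device functions only model the
assumptions stated in their comments.

```python
from typing import Callable, Iterable

# Hypothetical type: a fencing device is modelled as a function that returns
# True only when it can confirm the target node is powered off.
FenceDevice = Callable[[str], bool]


def fence(devices: Iterable[FenceDevice], node: str) -> bool:
    """Try each device in order; succeed on the first confirmed power-off."""
    for device in devices:
        if device(node):
            return True
    return False


def may_take_over(devices: Iterable[FenceDevice], node: str) -> bool:
    """Resources may move only after the peer is confirmed fenced."""
    return fence(devices, node)


def ipmi_without_power(node: str) -> bool:
    # Assumption: with the power cable pulled, the management controller is
    # unreachable, so the power-off can never be confirmed.
    return False


def ups_outlet_off(node: str) -> bool:
    # Assumption: a UPS based device has its own power and can confirm that
    # the outlet feeding the node is off.
    return True


if __name__ == "__main__":
    print(may_take_over([ipmi_without_power], "bdmz02"))
    print(may_take_over([ipmi_without_power, ups_outlet_off], "bdmz02"))
```

With only the IPMI device, takeover never becomes permitted while the node
has no power, which matches the TIMEOUT in the log above. A second device
that does not depend on the node's own power supply changes the outcome.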
