Hi,
I have set up Heartbeat to mount a shared filesystem to provide HA on
our HPC cluster. It runs version 2.x of Heartbeat but uses a 1.x-style
config. If I understand correctly, this is fine.
I then went on to set up STONITH with the provided external/ipmi script.
I am able to reset the node using stonith from the command line:
stonith -t external/ipmi -F /etc/ha.d/ipmi-mds1.cfg -T reset mds1.engin.umich.edu
But when I add it to my ha.cf on both nodes:
stonith_host mds1.engin.umich.edu external/ipmi /etc/ha.d/ipmi-mds1.cfg
The cluster never uses it. I see in the ha-log:
heartbeat[6832]: 2008/07/30_15:08:02 WARN: node mds1.engin.umich.edu: is dead
heartbeat[6832]: 2008/07/30_15:08:02 WARN: No STONITH device configured.
heartbeat[6832]: 2008/07/30_15:08:02 WARN: Shared disks are not protected.
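To make the symptom easy to check for after each test, here is a minimal sketch that scans ha-log entries for a node being declared dead and then followed by the "No STONITH device configured" warning. The regular expression, the function name find_unfenced_deaths, and the sample list are assumptions based only on the three log entries quoted above (each taken as a single unwrapped line); other Heartbeat versions may format their log differently.

```python
import re

# Line layout assumed from the three ha-log entries quoted above:
#   heartbeat[PID]: YYYY/MM/DD_HH:MM:SS LEVEL: message
LOG_LINE = re.compile(
    r"^heartbeat\[(?P<pid>\d+)\]: "
    r"(?P<stamp>\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+): (?P<msg>.*)$"
)


def find_unfenced_deaths(lines):
    """Return (timestamp, node) pairs where a node was declared dead and
    heartbeat then warned that no STONITH device was configured."""
    hits = []
    pending = None  # (timestamp, node) of the most recent "is dead" warning
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a heartbeat log entry
        msg = m.group("msg")
        dead = re.match(r"node (\S+?):? is dead", msg)
        if dead:
            pending = (m.group("stamp"), dead.group(1))
        elif "No STONITH device configured" in msg and pending:
            hits.append(pending)
            pending = None
    return hits


# Hypothetical input: the entries from this message, one per line.
sample = [
    "heartbeat[6832]: 2008/07/30_15:08:02 WARN: node mds1.engin.umich.edu: is dead",
    "heartbeat[6832]: 2008/07/30_15:08:02 WARN: No STONITH device configured.",
    "heartbeat[6832]: 2008/07/30_15:08:02 WARN: Shared disks are not protected.",
]

for stamp, node in find_unfenced_deaths(sample):
    print(f"{stamp}: {node} declared dead but not fenced")
```

In practice the same function could be fed the lines of the real log file instead of the sample list; an empty result after a test would suggest the warning is no longer being logged.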
I killed heartbeat on mds1 by running: killall -9 heartbeat. The other
node then sees mds1's heartbeat as dead, but the disk is still mounted
on mds1. I would expect this to be cause to fence (shoot) the node,
but that does not happen.
How can I get heartbeat to realize that it can kill mds1 with
external/ipmi?
I can't find any reports of anyone else running into this problem.
Any help would be great.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems