hi, i'm hoping somebody can help me get my STONITH device working properly. i'll try to be brief... :)

two-node cluster (nodes named beast and mystique); each node has a DRAC5 card. i've elected to use the external/ipmi STONITH agent. testing my configuration with the following STONITH CLI command, run from mystique, reboots beast perfectly:

mystique$~: stonith -t external/ipmi -p "beast beast_drac hacluster supersecret" -T reset beast

however, i can't seem to get heartbeat to automagically STONITH beast when it fails. for my test, i simply run 'killall -9 heartbeat' on beast and wait. my resources fail over just fine to mystique, but beast never gets STONITHed :(

i've tried both version 1 and version 2 configurations to no avail. i'm attaching those here for reference.


VERSION 1:

// STONITH line in ha.cf
mystique$~: cat /etc/ha.d/ha.cf | grep ^stonith
stonith external/ipmi /etc/ha.d/drac5.cfg

// corresponding drac5.cfg file
mystique$~: cat /etc/ha.d/drac5.cfg
beast beast_drac hacluster supersecret

// perms on drac5.cfg
mystique$~: ls -al /etc/ha.d/drac5.cfg
-rw------- 1 root root 55 Aug 30 21:48 /etc/ha.d/drac5.cfg
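
a quick sanity check of the version 1 config file can be sketched in shell. this is only a sketch: check_stonith_cfg is a hypothetical helper (not part of heartbeat), the four-field layout is taken from the drac5.cfg line shown above, and `stat -c` assumes GNU coreutils.

```shell
#!/bin/sh
# check_stonith_cfg FILE
# hypothetical helper, not part of heartbeat: verifies that FILE has mode 600
# and exactly one non-empty line with four whitespace-separated fields
# (hostname, DRAC address, user, password, as in the file shown above).
check_stonith_cfg() {
    cfg="$1"
    [ -r "$cfg" ] || { echo "unreadable: $cfg"; return 1; }

    # permissions as octal digits, e.g. 600 (GNU stat syntax, an assumption)
    mode=$(stat -c %a "$cfg")
    [ "$mode" = "600" ] || { echo "bad mode: $mode"; return 1; }

    # count non-empty lines, and the fields on the first non-empty line
    lines=$(grep -c '[^[:space:]]' "$cfg")
    fields=$(awk 'NF { print NF; exit }' "$cfg")
    if [ "$lines" -ne 1 ] || [ "$fields" -ne 4 ]; then
        echo "expected 1 line with 4 fields, got $lines line(s), ${fields:-0} field(s)"
        return 1
    fi
    echo "ok: $cfg"
}
```

usage would be something like `check_stonith_cfg /etc/ha.d/drac5.cfg`, run as root since the file is only readable by root.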



VERSION 2:

// relevant resource
<primitive id="resource_stonith_beast" class="stonith" type="external/ipmi" provider="heartbeat">
  <meta_attributes id="resource_stonith_beast_meta_attrs">
    <attributes>
      <nvpair id="resource_stonith_beast_metaattr_target_role" name="target_role" value="started"/>
    </attributes>
  </meta_attributes>
  <instance_attributes id="resource_stonith_beast_instance_attrs">
    <attributes>
      <nvpair id="9cb48aa3-82c0-43df-b9a9-06cad731a67e" name="hostname" value="beast"/>
      <nvpair id="a5d8d0b8-f753-47ef-9381-736ed76822d0" name="ipaddr" value="beast_drac"/>
      <nvpair id="f3bb9451-003a-49b6-b12b-89cdff1a40a7" name="userid" value="hacluster"/>
      <nvpair id="f1954c51-f7af-474a-b63b-21aa22921c47" name="passwd" value="supersecret"/>
    </attributes>
  </instance_attributes>
  <operations>
    <op id="223c6d56-f8ec-4f42-85ca-214775f0fc2c" name="monitor" interval="15" timeout="15" start_delay="15" prereq="nothing"/>
    <op id="f0f6e6fd-211b-4bf8-9bbc-3f1fabf7fcba" name="start" timeout="15" prereq="nothing"/>
  </operations>
</primitive>

// force the resource to live on the right node
<rsc_location id="location_stonith_beast_on_mystique" rsc="resource_stonith_beast">
  <rule id="prefered_location_stonith_beast_on_mystique" score="INFINITY">
    <expression attribute="#uname" id="4fbbd7c1-4fc2-44ee-b9d4-1deccd505728" operation="eq" value="mystique"/>
  </rule>
</rsc_location>
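
one more thing that can be checked for the version 2 setup is whether fencing is switched on cluster-wide. the sketch below assumes the CRM reads a cluster option named stonith-enabled (spelled stonith_enabled in some releases) from crm_config; that option name is an assumption here, not something shown in the config above, and stonith_enabled_value is a hypothetical helper.

```shell
#!/bin/sh
# stonith_enabled_value
# hypothetical helper: reads CIB XML on stdin and prints the value of the
# first nvpair named stonith-enabled or stonith_enabled, or "unset" if none.
stonith_enabled_value() {
    # isolate the matching <nvpair .../> tag first, so that several nvpairs
    # on one line do not confuse the value extraction
    value=$(grep -o -E '<nvpair[^>]*name="stonith[-_]enabled"[^>]*>' \
        | sed -n 's/.*value="\([^"]*\)".*/\1/p' \
        | head -n 1)
    echo "${value:-unset}"
}
```

on a live node the input could come from `cibadmin -Q -o crm_config | stonith_enabled_value`; an answer of "unset" or "false" would be worth a closer look before digging into the resource itself.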


i'm finding these entries in ha-log, which seem to indicate that the fencing daemon is running and that tengine can connect to it:

tengine[6803]: 2008/08/30_22:02:47 info: te_connect_stonith: Attempting connection to fencing daemon...
tengine[6803]: 2008/08/30_22:02:48 info: te_connect_stonith: Connected
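
to pull every fencing-related entry out of the log around a test, a small filter like the following may help. it is a sketch only: stonith_log_lines is a hypothetical helper, and the match on "stonith" or "fenc" is a guess at what the relevant messages contain, based on the two lines above.

```shell
#!/bin/sh
# stonith_log_lines FILE
# hypothetical helper: prints every line of FILE that mentions stonith or
# fencing, case-insensitively. returns non-zero if nothing matches.
stonith_log_lines() {
    log="$1"
    grep -i -E 'stonith|fenc' "$log"
}
```

the argument would be whatever path the logfile directive in ha.cf points at; running it on mystique right after the 'killall -9 heartbeat' test on beast should show whether a fencing request was ever issued.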

at this point i'm at a loss, so any help/tips/advice would be greatly appreciated.

thanks!

chad
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
