Hi,

On Wed, Feb 13, 2008 at 10:39:07PM -0700, Damon Estep wrote:
> More details;
>
> Note: resource_cn2_stonith is the resource that shoots node cn2; the
> resource that should shoot cn1 is running on cn2, which is online.
>
> [EMAIL PROTECTED] ~]# crm_verify -LV
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: determine_online_status_fencing: Node cn1 (0048dc5f-f558-4ae4-a9bb-d0f62b0b4b5a) is un-expectedly down
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: determine_online_status: Node cn1 is unclean
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: native_color: Resource resource_cn2_stonith cannot run anywhere
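
In case it helps when the output gets long, here is a small shell sketch for picking out the resources that the policy engine could not place. The function name unplaceable_resources is made up for this example, and it assumes each log entry arrives on a single line (as crm_verify prints it, before any mail client wraps it).

```shell
# Hypothetical helper (not part of heartbeat): read crm_verify-style log
# text on stdin and print each resource reported as "cannot run anywhere",
# once per resource.
unplaceable_resources() {
    sed -n 's/.*Resource \([^ ]*\) cannot run anywhere.*/\1/p' | sort -u
}

# Possible usage on a cluster node, assuming the warnings go to stderr:
#   crm_verify -LV 2>&1 | unplaceable_resources
```

Nothing here changes the cluster; it only filters text, so it should be safe to run against saved logs as well.
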

It says here that the stonith resource doesn't run. Did you try
using a clone stonith resource? That should be a better
configuration for this type of stonith device.

Thanks,

Dejan

> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Action resource_cn2_stonith_stop_0 on cn1 is unrunnable (offline)
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Marking node cn1 unclean
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Action resource_vg8_drbd_stop_0 on cn1 is unrunnable (offline)
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Marking node cn1 unclean
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Action resource_vg8_fs_stop_0 on cn1 is unrunnable (offline)
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Marking node cn1 unclean
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Action resource_vg8_ip_stop_0 on cn1 is unrunnable (offline)
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Marking node cn1 unclean
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: stage6: Scheduling Node cn1 for STONITH
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Action resource_vg8_fs_stop_0 on cn1 is unrunnable (offline)
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Marking node cn1 unclean
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Action resource_vg8_ip_stop_0 on cn1 is unrunnable (offline)
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Marking node cn1 unclean
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Action resource_vg8_ip_stop_0 on cn1 is unrunnable (offline)
> crm_verify[5162]: 2008/02/13_22:33:51 WARN: custom_action: Marking node cn1 unclean
> Warnings found during check: config may not be valid
> [EMAIL PROTECTED] ~]#
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:linux-ha-[EMAIL PROTECTED] On Behalf Of Damon Estep
> > Sent: Wednesday, February 13, 2008 3:35 PM
> > To: [email protected]
> > Subject: [Linux-HA] STONITH frustration... external/rackpdu
> >
> > I have a cluster with many nodes (12), all connected to APC AP7900
> > rack PDU devices.
> >
> > A manually executed stonith command resets the outlet as expected,
> > as follows:
> >
> > # stonith -t external/rackpdu -T reset -p "rack_pdu_ip write_snmp_community outlet_number" nodename
> >
> > The stonith command requires a nodename, but it does not matter what
> > I put there as the external plugin does not require it (seems odd).
> >
> > Heartbeat 2.1.3 is configured with symmetric cluster = false, stonith
> > enabled = true, resource stickiness = INFINITY, crm = yes.
> >
> > When I disable "stonith enabled" I get clean failovers when a node
> > dies, but with stonith enabled I get a log entry on the DC that
> > STONITH has been scheduled, and then nothing happens: no STONITH, no
> > failover, just orphaned resources.
> >
> > I have created the stonith external/rackpdu resource and created a
> > constraint that makes it run on only one node (the node that is home
> > to the DRBD peer). The resource shows as running on the node that the
> > failover would normally go to when stonith is disabled, and the
> > resource is set up to STONITH the node that the resource runs on
> > normally.
> >
> > What further debugging can I do to determine why the STONITH gets
> > scheduled but never executes?
> >
> > There are no entries in the syslog about a STONITH script failure.
> > The script should execute snmpset, and I have tested that the command
> > as formatted by the script does execute and produce the desired
> > results (when run as root):
> >
> > # snmpset -v 1 -c community_name pdu_hostname .1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.1 i 3
> >
> > Any suggestions?
> >
> > Thank you
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
