I replaced the strncmp() calls in the stonithd.c function that matches
the node_name against the device's list of controlled hosts with the
case-insensitive strncasecmp(), and it's working like a champ now.
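
For anyone who hits the same problem, the idea can be sketched roughly as
below. This is only an illustration, not the actual stonithd.c code:
host_in_list() and its parameters are hypothetical names, and the host list
is assumed to be a NULL-terminated array of strings. The sketch uses
strcasecmp() rather than strncasecmp(), so that a name that is only a
prefix of a list entry does not count as a match.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp() (POSIX) */

/*
 * Hypothetical helper, not taken from stonithd.c.
 * Returns 1 if node_name equals any entry of the NULL-terminated
 * hostlist when case is ignored, and 0 otherwise.
 */
static int
host_in_list(const char *node_name, char *const *hostlist)
{
    size_t i;

    if (node_name == NULL || hostlist == NULL) {
        return 0;
    }
    for (i = 0; hostlist[i] != NULL; i++) {
        /* Whole-string, case-insensitive comparison. */
        if (strcasecmp(hostlist[i], node_name) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With a host list of "tpc-dal-prlores3" and "tpc-dal-tcfs2", as reported by
the device in this thread, a lookup for "TPC-DAL-PRLORES3" would match,
whereas a case-sensitive strncmp() would not.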

Are node names case sensitive or insensitive? If they are insensitive,
it might be a good idea to do all node name comparisons with
strncasecmp() instead, just to thwart any future case issues. :)


On Thu, Mar 4, 2010 at 4:02 AM, Andreas Kurz <[email protected]> wrote:
> On Wednesday 03 March 2010 20:40:18 Brian Wolfe wrote:
>> I have a cluster setup with 2 dell servers, dual ethernet heartbeats,
>> and a single 8-port APCMaster PDU switch.  The cluster works except
>> for one issue. The cloned stonithd interface refuses to make a call to
>> the apcmaster to power down the node that's "dead". Reading through
>> the logs I can see that during setup the stonithd asks the
>> apcmastersnmp module to check its hosts list and it returns the
>> correct hostnames  "tpc-dal-prlores3 tpc-dal-tcfs2". However when the
>> time comes for it to actually use the device I get the following
>> message from stonithd refusing to actually kill the other node.
>
> hmm .... the outlet names of the PDU are also uppercase?
>
> Regards,
> Andreas
>
>>
>> Mar  3 13:00:36 TPC-DAL-TCFS2 crmd: [15805]: info: te_fence_node:
>> Executing poweroff fencing operation (24) on TPC-DAL-PRLORES3
>> (timeout=60000)
>> Mar  3 13:00:36 TPC-DAL-TCFS2 crmd: [15805]: debug: waiting for the
>> stonith reply msg.
>> Mar  3 13:00:36 TPC-DAL-TCFS2 stonithd: [15800]: info: client tengine
>> [pid: 15805] requests a STONITH operation POWEROFF on node
>> TPC-DAL-PRLORES3
>> Mar  3 13:00:36 TPC-DAL-TCFS2 stonithd: [15800]: info: we can't manage
>> TPC-DAL-PRLORES3, broadcast request to other nodes
>> Mar  3 13:00:36 TPC-DAL-TCFS2 stonithd: [15800]: debug: inserted
>> optype=POWEROFF, key=-2
>> Mar  3 13:00:36 TPC-DAL-TCFS2 stonithd: [15800]: info: Broadcasting
>> the message succeeded: require others to stonith node
>> TPC-DAL-PRLORES3.
>> Mar  3 13:00:36 TPC-DAL-TCFS2 stonithd: [15800]: debug:
>> stonithd_node_fence: sent back a synchronous reply.
>> Mar  3 13:00:36 TPC-DAL-TCFS2 crmd: [15805]: debug:
>> stonithd_node_fence:574: stonithd's synchronous answer is ST_APIOK
>>
>>
>> The stonith is configured as follows:
>>
>>     <clone id="fencing" >
>>         <primitive class="stonith" id="apcstonith23" type="apcmastersnmp" >
>>         <operations id="apcstonith23-operations" >
>>           <op id="apcstonith23-op-monitor-15" interval="15"
>> name="monitor" start-delay="15" timeout="15" />
>>          </operations>
>>  <instance_attributes id="apcstonith23-instance_attributes" >
>>  <nvpair id="nvpair-604e339f-a400-4b30-82c0-f046de0ed663"
>> name="ipaddr" value="172.20.1.23" />
>> <nvpair id="nvpair-ed611421-97a1-4091-a5cd-8159f1230096" name="port"
>> value="161" />
>>  <nvpair id="nvpair-997431e2-ea78-4065-b835-f9149bbcb596"
>> name="community" value="private" />
>>  </instance_attributes>
>> </primitive>
>>  <meta_attributes id="fencing-meta_attributes" >
>>   </meta_attributes>
>> </clone>
>>
>>
>> I can confirm the use of the stonith via the command "stonith -t
>> apcmastersnmp <params> tpc-dal-prlores3" and it'll switch off the
>> server.
>>
>> Any help would be appreciated.
>> _______________________________________________
>> Linux-HA mailing list
>> [email protected]
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>
>