I'll have to check against the head next week for that and any other
node name comparisons in the code base. Work just dumped another set
of servers in my lap today, so life is getting busy.
I noticed that in most places where the node name is mentioned it's in
all uppercase. Only the stonith interface presents them as lowercase
in the logs, as far as I have noticed at least. Interestingly enough,
the stonith APCMaster device has the outlet labels in all uppercase.
So I don't know if it's the SNMP protocol changing the case, or
apcmastersnmp.so doing it, or something else.

Mentioning lowercase hostnames in the documentation/FAQs might be
warranted as a temporary stopgap for unwary souls like myself. :)
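
For anyone who finds this thread later, here is roughly what the
change amounts to. This is only a minimal sketch and not the actual
stonithd.c code: the helper name host_in_list() and the array-of-strings
host list are made up for illustration. The only real calls are
strlen() and the POSIX strncasecmp().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* POSIX: declares strncasecmp() */

/*
 * Hypothetical helper (not from stonithd.c): returns 1 if node_name
 * matches any entry in hosts[], ignoring case, and 0 otherwise.
 */
int host_in_list(const char *node_name, const char *const hosts[],
                 size_t nhosts)
{
    assert(node_name != NULL);
    size_t len = strlen(node_name);

    for (size_t i = 0; i < nhosts; i++) {
        /* Require equal lengths so a name cannot match a longer
         * entry that merely starts with the same characters. */
        if (strlen(hosts[i]) == len &&
            strncasecmp(hosts[i], node_name, len) == 0)
            return 1;
    }
    return 0;
}
```

With a case-sensitive strncmp() in place of strncasecmp(), the lookup
for "TPC-DAL-PRLORES3" against a device list of "tpc-dal-prlores3
tpc-dal-tcfs2" fails, which would explain the "we can't manage" message
in my logs.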

Thanks for the help.

On Fri, Mar 5, 2010 at 6:07 AM, Dejan Muhamedagic <[email protected]> wrote:
> Hi,
>
> On Fri, Mar 05, 2010 at 10:38:24AM +0100, Andreas Kurz wrote:
>> On Thursday 04 March 2010 20:00:58 Brian Wolfe wrote:
>> > I replaced the strncmp() calls in the stonithd.c function for matching
>> > the node_name to the device hosts controlled list with the
>> > case-insensitive version strncasecmp() and it's working like a champ
>> > now.
>>
>> don't forget to post the patch, thx
>
> That's already been done in December and should be available in
> 1.0.7. See http://developerbugs.linux-foundation.org/show_bug.cgi?id=2292
>
>> > Are the node names case sensitive or insensitive? If they are
>> > insensitive then it might be a good idea to do all node name
>> > comparisons with the strncasecmp() call instead just to thwart any
>> > future case issues. :)
>
> I doubt that's going to happen for anything else apart from when
> getting input from outside, such as with stonith devices.
> Heartbeat explicitly converts the node name to lowercase.
> Corosync seems to care only about IP addresses/node ids, so it
> shouldn't matter there. But it could matter in Pacemaker. At any
> rate, just keep your hostnames lowercase.
>
> Thanks,
>
> Dejan
>
>> maybe someone volunteers for that? ;-)
>>
>> Regards,
>> Andreas
>>
>> >
>> > On Thu, Mar 4, 2010 at 4:02 AM, Andreas Kurz <[email protected]> wrote:
>> > > On Wednesday 03 March 2010 20:40:18 Brian Wolfe wrote:
>> > >> I have a cluster setup with 2 dell servers, dual ethernet heartbeats,
>> > >> and a single 8-port APCMaster PDU switch.  The cluster works except
>> > >> for one issue. The cloned stonithd interface refuses to make a call to
>> > >> the apcmaster to power down the node that's "dead". Reading through
>> > >> the logs I can see that during setup the stonithd asks the
>> > >> apcmastersnmp module to check its hosts list and it returns the
>> > >> correct hostnames "tpc-dal-prlores3 tpc-dal-tcfs2". However when the
>> > >> time comes for it to actually use the device I get the following
>> > >> message from stonithd refusing to actually kill the other node.
>> > >
>> > > hmm .... the outlet names of the PDU are also uppercase?
>> > >
>> > > Regards,
>> > > Andreas
>> > >
>> > >> Mar  3 13:00:36 TPC-DAL-TCFS2 crmd: [15805]: info: te_fence_node:
>> > >> Executing poweroff fencing operation (24) on TPC-DAL-PRLORES3
>> > >> (timeout=60000)
>> > >> Mar  3 13:00:36 TPC-DAL-TCFS2 crmd: [15805]: debug: waiting for the
>> > >> stonith reply msg.
>> > >> Mar  3 13:00:36 TPC-DAL-TCFS2 stonithd: [15800]: info: client tengine
>> > >> [pid: 15805] requests a STONITH operation POWEROFF on node
>> > >> TPC-DAL-PRLORES3
>> > >> Mar  3 13:00:36 TPC-DAL-TCFS2 stonithd: [15800]: info: we can't manage
>> > >> TPC-DAL-PRLORES3, broadcast request to other nodes
>> > >> Mar  3 13:00:36 TPC-DAL-TCFS2 stonithd: [15800]: debug: inserted
>> > >> optype=POWEROFF, key=-2
>> > >> Mar  3 13:00:36 TPC-DAL-TCFS2 stonithd: [15800]: info: Broadcasting
>> > >> the message succeeded: require others to stonith node
>> > >> TPC-DAL-PRLORES3.
>> > >> Mar  3 13:00:36 TPC-DAL-TCFS2 stonithd: [15800]: debug:
>> > >> stonithd_node_fence: sent back a synchronous reply.
>> > >> Mar  3 13:00:36 TPC-DAL-TCFS2 crmd: [15805]: debug:
>> > >> stonithd_node_fence:574: stonithd's synchronous answer is ST_APIOK
>> > >>
>> > >>
>> > >> The stonith is configured as follows:
>> > >>
>> > >>     <clone id="fencing">
>> > >>       <primitive class="stonith" id="apcstonith23" type="apcmastersnmp">
>> > >>         <operations id="apcstonith23-operations">
>> > >>           <op id="apcstonith23-op-monitor-15" interval="15" name="monitor" start-delay="15" timeout="15"/>
>> > >>         </operations>
>> > >>         <instance_attributes id="apcstonith23-instance_attributes">
>> > >>           <nvpair id="nvpair-604e339f-a400-4b30-82c0-f046de0ed663" name="ipaddr" value="172.20.1.23"/>
>> > >>           <nvpair id="nvpair-ed611421-97a1-4091-a5cd-8159f1230096" name="port" value="161"/>
>> > >>           <nvpair id="nvpair-997431e2-ea78-4065-b835-f9149bbcb596" name="community" value="private"/>
>> > >>         </instance_attributes>
>> > >>       </primitive>
>> > >>       <meta_attributes id="fencing-meta_attributes">
>> > >>       </meta_attributes>
>> > >>     </clone>
>> > >>
>> > >>
>> > >> I can confirm the use of the stonith via the command "stonith -t
>> > >> apcmastersnmp <params> tpc-dal-prlores3" and it'll switch off the
>> > >> server.
>> > >>
>> > >> Any help would be appreciated.
>> > >> _______________________________________________
>> > >> Linux-HA mailing list
>> > >> [email protected]
>> > >> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> > >> See also: http://linux-ha.org/ReportingProblems
>> >