See below;

Gary Romo
IBM Global Technology Services
303.458.4415
Email: [EMAIL PROTECTED]
Pager: 1.877.552.9264
Text message: [EMAIL PROTECTED]
jim parsons <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
01/17/2008 03:40 PM
Please respond to linux clustering <[email protected]>

To: linux clustering <[email protected]>
cc: [EMAIL PROTECTED]
Subject: Re: [Linux-cluster] BladeCenter Fencing errors

On Thu, 2008-01-17 at 14:06 -0700, Gary Romo wrote:
>
> I enabled telnet on the MM, now I am getting these messages;
>
> Jan 17 14:00:24 node1 fenced[3229]: fence "node2" failed
> Jan 17 14:00:29 node1 fenced[3229]: fencing node "node2"
> Jan 17 14:00:40 node1 fenced[3229]: agent "fence_bladecenter" reports:
> pattern match timed-out at /sbin/fence_bladecenter line 189
>
> Jan 17 14:00:40 node1 fenced[3229]: fence "node2" failed
> Jan 17 14:00:45 node1 fenced[3229]: fencing node "node2"
> Jan 17 14:00:56 node1 fenced[3229]: agent "fence_bladecenter" reports:
> pattern match timed-out at /sbin/fence_bladecenter line 189
>
> Jan 17 14:00:56 node1 fenced[3229]: fence "node2" failed
> Jan 17 14:01:01 node1 fenced[3229]: fencing node "node2"
> Jan 17 14:01:12 node1 fenced[3229]: agent "fence_bladecenter" reports:
> pattern match timed-out at /sbin/fence_bladecenter line 189
>
> Line 189 looks like this;
>
> ($text, $match) = $t->waitfor("/system:blade\\[$bladenum\\]>/");
>
>
> I am getting these on the second node;
>
> Jan 17 14:03:24 mode2 fenced[3340]: fence "node1" failed
> Jan 17 14:03:29 node2 fenced[3340]: fencing node "node1"
> Jan 17 14:03:29 node2 fenced[3340]: fence "node1" failed
> Jan 17 14:03:34 node2 fenced[3340]: fencing node "node1"
> Jan 17 14:03:34 node2 fenced[3340]: fence "node1" failed
>

Ah, yuck. Well, let's figure out what is going on here.

Can you post the clusternodes and fencedevices sections of your
cluster.conf here? Just XXXX out any passwords.
<?xml version="1.0"?>
<cluster alias="rhcs-1-clus" config_version="4" name="rhcs-1-clus">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="node1" votes="1">
                        <multicast addr="XXX.XXX.127.204" interface="eth0"/>
                        <fence>
                                <method name="1">
                                        <device blade="2" name="chassis_fence"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node2" votes="1">
                        <multicast addr="XXX.XXX.127.204" interface="eth0"/>
                        <fence>
                                <method name="1">
                                        <device blade="3" name="chassis_fence"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1">
                <multicast addr="XXX.XXX.127.204"/>
        </cman>
        <fencedevices>
                <fencedevice agent="fence_bladecenter" ipaddr="XXX.XXX.1.143" login="rchs_fence" name="chassis_fence" passwd="XXXXXXX"/>
        </fencedevices>

On one of the cluster nodes, can you run
'/sbin/fence_bladecenter -a <ip or hostname of bladecenter> -l <login> -p <passwd> -n <blade number of another running node> -o status -v'

[EMAIL PROTECTED] ~]# /sbin/fence_bladecenter -a chassis -l rchs_fence -p XXXXXXX -n 2 -o status -v
Please use '-h' for usage.

Do you know firmware details about your bladecenter? The
fence_bladecenter script hasn't changed in years... The tested firmware
versions are in the top of the file. Maybe the interface has changed. If
so, the debuglog should give us information.

1  chassis  Main application  BRET85M  CNETMNUS.PKT  01-10-07  16
            Boot ROM*         BRBR82A  CNETBRUS.PKT  06-01-05  16
            Remote control    BRRG85M  CNETRGUS.PKT  01-10-07  16

This will get us started.

-Jim

--
Linux-cluster mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-cluster
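
The "pattern match timed-out" error quoted from line 189 suggests the agent waited for a prompt of the form system:blade[N]> and never saw one. A minimal sketch of that check follows, written in Python rather than the agent's Perl so it can be run standalone. The function names and the sample session strings are hypothetical, made up for illustration; only the regex comes from the quoted line 189. It shows that the wait can only succeed when the text coming back over telnet contains the exact prompt for the requested blade number.

```python
import re


def prompt_pattern(bladenum: int) -> "re.Pattern[str]":
    """Build the same regex quoted from line 189: /system:blade\\[$bladenum\\]>/."""
    return re.compile(r"system:blade\[%d\]>" % bladenum)


def saw_blade_prompt(session_text: str, bladenum: int) -> bool:
    """Return True if the captured session text contains the expected blade prompt."""
    return prompt_pattern(bladenum).search(session_text) is not None


if __name__ == "__main__":
    # Hypothetical session captures, not taken from a real management module.
    matching = "OK\nsystem:blade[3]> "
    other_prompt = "OK\nsystem> "

    print(saw_blade_prompt(matching, 3))      # True
    print(saw_blade_prompt(other_prompt, 3))  # False: the wait would time out
```

If a captured telnet session never shows that exact prompt, for example because a firmware revision formats it differently, the wait would run until its timeout, which would be consistent with the repeated failures in the log above. That is an assumption to check against the debuglog, not a diagnosis.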
