Hi,

I know the ssh STONITH plugin should be used only for development
purposes, and that is all I am using it for right now.

I have configured my cluster with a stonith_ssh resource, which uses the
external IP address to connect to the other server. Running the plugin
by hand worked, and the other node rebooted immediately:

  stonith -t ssh -p "nagios1.hq.xxx" -T reset nagios1.hq.xxx

But when I pulled the cross-over cable the results were weird. Each node
had the same entry in the logfile which looked like this:

Feb  6 12:57:59 nagios2 tengine: [27786]: info: te_fence_node: Executing reboot fencing operation (63) on nagios1 (timeout=30000)
Feb  6 12:57:59 nagios2 stonithd: [27368]: info: client tengine [pid: 27786] want a STONITH operation RESET to node nagios1.
Feb  6 12:57:59 nagios2 stonithd: [27368]: info: Broadcasting the message succeeded: require others to stonith node nagios1.

Then, 30 seconds later (which matches the timeout=30000 above), it said:

Feb  6 12:58:29 nagios2 stonithd: [27368]: ERROR: Failed to STONITH the node nagios1: optype=RESET, op_result=TIMEOUT

I'm not sure why it did not reboot the other machine. Does 'require
others' mean that it won't work in a 2-node setup? 

The really weird part is that even though neither node was rebooted,
only one node continued to run the resources, while the other did
nothing at all. When I reconnected the cross-over cable, both nodes went
into 'standby' state for a couple of seconds, but after that everything
worked fine.

Regards,
Tobi
