Hi, I know I should use the ssh STONITH plugin only for development purposes, and that's what I am doing right now.
I have configured my cluster and a resource stonith_ssh. I configured stonith_ssh to use the external IP address to connect to the other server. When I ran

    stonith -t ssh -p "nagios1.hq.xxx" -T reset nagios1.hq.xxx

it worked: the other node rebooted immediately. But when I pulled the cross-over cable the results were weird. Each node had the same entry in its logfile, which looked like this:

Feb 6 12:57:59 nagios2 tengine: [27786]: info: te_fence_node: Executing reboot fencing operation (63) on nagios1 (timeout=30000)
Feb 6 12:57:59 nagios2 stonithd: [27368]: info: client tengine [pid: 27786] want a STONITH operation RESET to node nagios1.
Feb 6 12:57:59 nagios2 stonithd: [27368]: info: Broadcasting the message succeeded: require others to stonith node nagios1.

Then later it said:

Feb 6 12:58:29 nagios2 stonithd: [27368]: ERROR: Failed to STONITH the node nagios1: optype=RESET, op_result=TIMEOUT

I'm not sure why it did not reboot the other machine. Does 'require others' mean that it won't work in a 2-node setup?

The really weird part is that even though no node was rebooted, only one node continued to run the resources, whereas the other was just doing nothing at all. When I reconnected the cross-over cable both nodes went into 'standby' state for a couple of seconds, but everything was working fine.

Regards,
Tobi
