> > I've set up a five-node HA cluster running a bunch of services, and
> > all seems to be going well, although one node insists on telling
> > everyone else that it's running any new resource which is set up but
> > not activated. I use crm_resource -C to deal with this situation, so
> > it's not so bad.
> >
> > Anyway, these nodes are HP DL380-G5s, and so I'm using the riloe
> > STONITH script. I tested it by hand, and it works like a charm.
>
> How did you test it?

stonith -t external/riloe -p <param list> -T reset

It worked, so I configured the stonith resource for each machine by
specifying all of the parameters in the same way as I did here.

> > I've set up the STONITH resources so that they never run on the same
> > machine as the one they control. The other day, I artificially caused
> > a situation in which one of the nodes should have been fenced. The
> > cluster realized this and "scheduled" it for fencing, but the fence
> > never happened. I'm wondering what this "scheduling" is, and what
> > parameters are available to control it?
>
> I suppose that when you say "scheduled" you're referring to a
> log message. That means that the cluster (CRM) decided that a
> node should be fenced. If that didn't happen then your
> stonith module doesn't work. There should've been an error
> message in the logs.
>
> You can test your setup using the stonith program (see the
> stonith(8) man page for details). If it doesn't work as you
> expect, turn debugging on with the -d option.

stonith on the command line did work, and I configured the stonith
resources in the same way. CRM never got around to actually doing the
fencing, and so the logs never said anything more than "node x
scheduled for fencing". It never even tried to fence.
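
For anyone who wants to repeat the manual test with the suggested -d debugging turned on, here is a minimal dry-run sketch. It only prints the command rather than running it, so nothing gets reset by accident. The function name build_stonith_test and the variables STONITH_PARAMS and TARGET_NODE are hypothetical placeholders of mine, not values from this cluster, and I am assuming the target node name goes after -T reset as described in stonith(8).

```shell
#!/bin/sh
# Dry-run helper: prints the stonith test command instead of executing it.
# build_stonith_test, STONITH_PARAMS and TARGET_NODE are hypothetical names
# used for illustration only; substitute the real riloe parameters by hand.
build_stonith_test() {
    params=$1   # parameter list, same as given to the stonith resource
    node=$2     # node that would be reset
    # -d turns on debugging, -t selects the plugin, -T reset is the action
    printf 'stonith -d -t external/riloe -p "%s" -T reset %s\n' "$params" "$node"
}

# Placeholders are printed as-is when the variables are unset.
build_stonith_test "${STONITH_PARAMS:-<param list>}" "${TARGET_NODE:-<node>}"
```

Once the printed line looks right, it can be pasted into a shell on a node other than the target, and the debug output compared with what the CRM logs show.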
