On 9/29/07, Rob Aronson <[EMAIL PROTECTED]> wrote:
> I have a 2 node cluster on SLES10.1 running groupwise. The resources are
> made up of a filesystem, an ip address and an executable. I have a WTI nps
> 400 for stonith. I built the cluster using the HAgui.
>
> Today when I was doing some testing of forced failover, I ifdowned the
> interface the ipaddress resource was bound to. It should have forced the
> resource group to failover. One of my IP address resources got stuck in a
> failed state, the cluster kept trying to tell the failed node to stop the
> address resource. I know I must have a configuration wrong, but I don't know
> which one.

The cluster can't fail the resource over until it is confirmed stopped on
the old node. If the stop fails, we would have issued a stonith request;
however, this will have failed (repeatedly) because your stonith resources
aren't running:

Sep 29 09:37:12 gwcluster1 pengine: [18702]: WARN: native_color: Resource resource_Stonith:0 cannot run anywhere
Sep 29 09:37:12 gwcluster1 pengine: [18702]: WARN: native_color: Resource resource_Stonith:1 cannot run anywhere

This explains why the resource was never moved.

> The second problem I have is with stonith. It looks like it should work and
> yet instead of shooting anyone it just dies. I don't get that one either.

Can you be a little more specific than "just dies"?

> here's my config file and some excerpts of the log.
>
>
> Thanks
> --
> Rob Aronson
> Storage, Virtualization and Orchestration Practice Manager, Novacoast
> USA
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
