On 9/29/07, Rob Aronson <[EMAIL PROTECTED]> wrote:
> I have a 2 node cluster on SLES10.1 running groupwise. The resources are
> made up of a filesystem, an ip address and an executable. I have a WTI nps
> 400 for stonith. I built the cluster using the HAgui.
>
> Today when I was doing some testing of forced failover, I ifdowned the
> interface the IP address resource was bound to. It should have forced the
> resource group to fail over. One of my IP address resources got stuck in a
> failed state; the cluster kept trying to tell the failed node to stop the
> address resource. I know I must have a configuration wrong, but I don't know
> which one.

it can't fail over until the resource is confirmed stopped on the old node.

if the stop fails, the cluster issues a stonith request for that node.
in your case that request will have failed (repeatedly), since your
stonith resources aren't running:

Sep 29 09:37:12 gwcluster1 pengine: [18702]: WARN: native_color: Resource resource_Stonith:0 cannot run anywhere
Sep 29 09:37:12 gwcluster1 pengine: [18702]: WARN: native_color: Resource resource_Stonith:1 cannot run anywhere

this explains why the resource was never moved.
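
to make the reasoning above concrete, here is a toy model of the decision in Python. it is only a sketch of the rule described in this mail, not the actual pengine logic: `NodeState`, `can_fail_over` and all field names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class NodeState:
    """Hypothetical view of the node the resource is currently on."""
    stop_confirmed: bool           # did the stop action succeed on the old node?
    stonith_running: bool          # is any stonith resource running in the cluster?
    fence_succeeded: bool = False  # result of fencing, only meaningful if stonith runs


def can_fail_over(old: NodeState) -> Tuple[bool, str]:
    """Return (allowed, reason) for moving the resource off the old node."""
    if old.stop_confirmed:
        # Normal case: the resource is known to be stopped, so it can move.
        return True, "stop confirmed on old node"
    if not old.stonith_running:
        # The stop failed and there is nothing available to fence the node.
        return False, "stop failed and no stonith resource is running"
    if old.fence_succeeded:
        # A fenced node cannot still be running the resource.
        return True, "old node fenced"
    return False, "fencing attempt failed"


# The situation in the logs: stop failed, stonith clones cannot run anywhere.
print(can_fail_over(NodeState(stop_confirmed=False, stonith_running=False)))
```

with the stop failing and no stonith resource running, the model never allows the move, which matches what you saw.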

> The second problem I have is with stonith. It looks like it should work and
> yet instead of shooting anyone it just dies. I don't get that one either.

can you be a little more specific than "just dies"?

> here's my config file and some excerpts of the log.
>
>
> Thanks
> --
> Rob Aronson
> Storage, Virtualization and Orchestration Practice Manager, Novacoast
> USA
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
>
