On Tue, Mar 25, 2008 at 5:49 PM, Niels de Carpentier
<[EMAIL PROTECTED]> wrote:
> I'm having an issue with heartbeat, where it has running resources on a
>  server which is offline. The crm_mon status is:
>
>  ============
>  Last updated: Tue Mar 25 16:43:38 2008
>  Current DC: front001(7c5cc43a-0601-4924-afa5-1bf1f29efcfb)
>  5 Nodes configured.
>  13 Resources configured.
>  ============
>
>  Node: front001 (7c5cc43a-0601-4924-afa5-1bf1f29efcfb): online
>  Node: front005 (7bbcb330-7d41-445f-ada0-a61d1046863e): online
>  Node: front004 (77e65606-f308-4c59-8493-65b1b571a2ab): online
>  Node: front003 (3d8b08af-160c-4268-9446-20321e8803aa): OFFLINE
>  Node: front002 (607f465e-3fef-4dd3-8afc-35dadec069ec): online
>
>  Full list of resources:
>
>  xxxxxx001       (heartbeat::ocf:xen):   Started front001
>  yyyyyy001       (heartbeat::ocf:xen):   Started front001
>  yyyyyy002       (heartbeat::ocf:xen):   Started front002
>  xxx001  (heartbeat::ocf:xen):   Started front003
>  xxx002  (heartbeat::ocf:xen):   Started front004
>  xx001   (heartbeat::ocf:xen):   Started front003
>  xx002   (heartbeat::ocf:xen):   Started front004
>  xxxxxx002       (heartbeat::ocf:xen):   Started front002
>  stonith_front001        (stonith:external/ipmitool)[    front005 front003 ]
>  stonith_front002        (stonith:external/ipmitool):    Started front001
>  stonith_front003        (stonith:external/ipmitool):    Started front005 FAILED
>  stonith_front004        (stonith:external/ipmitool):    Started front001
>  stonith_front005        (stonith:external/ipmitool):    Started front001
>
>  Failed actions:
>     stonith_front003_start_0 (node=front005, call=2523, rc=1): Error
>     stonith_front003_start_0 (node=front001, call=74, rc=1): Error
>     stonith_front003_start_0 (node=front004, call=22, rc=1): Error
>     stonith_front003_monitor_0 (node=front002, call=20, rc=14): Error
>     stonith_front003_start_0 (node=front002, call=22, rc=1): Error
>
>
>  front003 has a hardware failure, so it is to be expected that the
>  stonith action will fail. (This is a custom stonith script, so there
>  might be some bugs left in it. The xen ocf script is also a custom one.)
>
>  The real problem is that it shows 2 resources running on front003,
>  while this server is obviously offline. It should move the resources to
>  one of the other servers, but doesn't for some reason.

How can it?  It's offline, remember.
Or at least it _appears_ offline, which is the whole point of
STONITH: to make _sure_ it's offline before starting the resources
elsewhere.

So until the STONITH command succeeds, the resources won't be moved.
They show up as running on that node because, as far as the cluster can
confirm, they still are.
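
To make the rule concrete, here is a minimal sketch of the decision in
Python. It is not Heartbeat's actual code: the Node class, the fenced
flag and recoverable_resources() are hypothetical names made up for
illustration, and the resource and node names are borrowed from the
crm_mon output above. The only point is that "offline" alone is not
enough; a resource becomes movable only once fencing of its node has
been confirmed.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    online: bool
    # Hypothetical flag: True only once a STONITH action against this
    # node has been confirmed as successful.
    fenced: bool = False


def recoverable_resources(nodes, placement):
    """Return the resources that may safely be started elsewhere.

    placement maps a resource name to the node it was last known on.
    """
    by_name = {node.name: node for node in nodes}
    movable = []
    for resource, node_name in placement.items():
        node = by_name[node_name]
        # An offline node that has not been fenced might still be
        # running the resource, so the resource has to stay put.
        if not node.online and node.fenced:
            movable.append(resource)
    return sorted(movable)


nodes = [Node("front003", online=False), Node("front004", online=True)]
placement = {"xxx001": "front003", "xx001": "front003", "xxx002": "front004"}

# STONITH has failed so far: nothing on front003 may be moved.
print(recoverable_resources(nodes, placement))  # []

# Once fencing is confirmed, the two resources become recoverable.
nodes[0].fenced = True
print(recoverable_resources(nodes, placement))  # ['xx001', 'xxx001']
```

In the situation above the stonith_front003 start keeps failing, so the
cluster stays in the first state and keeps reporting xxx001 and xx001 as
started on front003.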
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
