Re: [Linux-HA] Two node cluster monitoring configuration to ignore failing on restart

Florian Crouzat Mon, 03 Oct 2011 00:50:41 -0700

Florian Crouzat wrote on 2011-09-29:

> Hi,
>
> I'm running a two node cluster where all the resources have to run on
> the same node and failed resources must not trigger anything.
>
> I'm having trouble configuring the following behavior:
>  * An lsb resource "foo" is monitored every 10 seconds, and the cluster
>    must try to restart it on a bad status return code;
>  * If the restart-on-failure fails, I don't want to do anything more
>    yet, just keep going with a failed resource.
>
> Since all my resources are linked by colocation and order constraints,
> a failing restart of this resource currently moves everything to the
> other node.
>
> My test case is to put "exit 4" in the foo initscript in the start
> section and issue 'kill -KILL $(pidof foo)'.
>
> I tried the following configuration:
>
> primitive bind lsb:foo \
>         meta target-role="Started" \
>         op monitor on-fail="restart" interval="10s" \
>         op start on-fail="ignore" interval="0"
> and
>
> primitive bind lsb:foo \
>         meta target-role="Started" \
>         op monitor on-fail="restart" interval="10s" OCF_CHECK_LEVEL="10" \
>         op monitor on-fail="ignore" interval="60s" OCF_CHECK_LEVEL="20"
>
> I believe the first configuration doesn't work because the "op start"
> settings apply only to a /real/ start of the service, not to a restart
> issued by the "op monitor". I don't really understand the second
> configuration, but it doesn't work either.
>
> The "restart-on-failure" part is really easy and works on its own. But
> I just can't find a way to ignore a failing restart.
>
> Any help appreciated.


Digging up this issue, as I still haven't managed to configure this behavior
and can't move forward.
Is it even possible to have an on-fail="restart" that won't move all the
resources if the restart itself fails?
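
For reference, here is a minimal sketch of the test case from my earlier
message. It is not the real initscript: foo_init, simulate_kill and
FOO_RUNNING are hypothetical names made up for illustration, standing in
for /etc/init.d/foo and for 'kill -KILL $(pidof foo)'. It only shows the
return codes the cluster would see, assuming LSB status conventions.

```shell
#!/bin/sh
# Hypothetical stand-in for the action dispatch of /etc/init.d/foo.
# FOO_RUNNING is an invented flag that replaces a real process check.
FOO_RUNNING=1   # pretend foo is running at the outset

foo_init() {
    case "$1" in
        start)
            # Deliberate failure: the "exit 4" from the test case,
            # written as "return" because this is a function.
            return 4
            ;;
        stop)
            FOO_RUNNING=0
            return 0
            ;;
        status)
            # LSB status codes: 0 = running, 3 = not running.
            if [ "$FOO_RUNNING" -eq 1 ]; then
                return 0
            fi
            return 3
            ;;
        *)
            echo "Usage: foo {start|stop|status}" >&2
            return 2
            ;;
    esac
}

# Simulates the effect of: kill -KILL $(pidof foo)
simulate_kill() {
    FOO_RUNNING=0
}

# What the cluster observes: the monitor fails, then the restart
# (stop followed by start) fails on the start step.
simulate_kill
rc=0; foo_init status || rc=$?; echo "monitor rc=$rc"
rc=0; foo_init stop   || rc=$?; echo "stop rc=$rc"
rc=0; foo_init start  || rc=$?; echo "start rc=$rc"
```

Run as is, this prints "monitor rc=3", "stop rc=0" and "start rc=4"; it is
that last failed start which currently makes everything move to the other
node.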

Thank you.

Florian

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
