Florian Crouzat wrote on 2011-09-29:
> Hi,
>
> I'm running a two node cluster where all the resources have to run on
> the same node and failed resources must not trigger anything.
>
> I'm having trouble configuring the following behavior:
> * A lsb resource "foo" is monitored every 10 seconds and the cluster must
>   try to restart it on bad status return code ;
> * If the restart-on-failure fails, I don't want to do anything more yet,
>   just keep on going with a failed resource.
>
> All my resources being linked by collocation and order, right now a
> failing restart on my resource moves everything to the other node.
>
> My test case is to put "exit 4" in the foo initscript in the start
> section and issue 'kill -KILL $(pidof foo)'.
>
> I tried the following configuration:
>
> primitive bind lsb:foo \
>     meta target-role="Started" \
>     op monitor on-fail="restart" interval="10s" \
>     op start on-fail="ignore" interval="0"
>
> and
>
> primitive bind lsb:foo \
>     meta target-role="Started" \
>     op monitor on-fail="restart" interval="10s" OCF_CHECK_LEVEL="10" \
>     op monitor on-fail="ignore" interval="60s" OCF_CHECK_LEVEL="20"
>
> I believe the first configuration I tried doesn't work because the "op
> start" is only used on /real/ start of the service, not a restart issued by
> the "op monitor" and, I don't really understand the second configuration but
> it doesn't work either.
>
> The "restart-on-failure" part is really easy and works, alone. But I
> just can't find a way to ignore a failing restart.
>
> Any help appreciated.

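For reference, the test case quoted above boils down to roughly the sketch below. The function name foo_init is hypothetical and only stands in for the real foo initscript; what matters is that "start" always returns 4 while "status" reports honestly, so a monitor failure leads to a restart attempt that cannot succeed.

```shell
#!/bin/sh
# Hypothetical stand-in for the "foo" initscript used in the test case.
# Only the exit codes matter here; nothing is actually started or stopped.
foo_init() {
    case "$1" in
        start)
            # Simulated start failure: the "exit 4" from the test case.
            return 4
            ;;
        stop)
            # Stop is assumed to succeed.
            return 0
            ;;
        status)
            # LSB status codes: 0 = running, 3 = not running.
            if pidof foo >/dev/null 2>&1; then
                return 0
            else
                return 3
            fi
            ;;
        *)
            # LSB: 2 = invalid or excess argument(s).
            return 2
            ;;
    esac
}

# The failure is then triggered by hand on the active node with:
#   kill -KILL $(pidof foo)
```

After the kill, the next monitor sees "not running", the cluster attempts the restart, and the start returns 4.
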
Digging up this issue, as I'm still failing to configure this behavior and
can't move forward.

Is it even possible to have an on-fail="restart" that won't move all the
resources to the other node if the restart itself fails?

Thank you.
Florian

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
