------------------------------------------------------------------------
*From: *Lars Marowsky-Bree <l...@suse.com>
*Sent: * 2013-12-06 13:44:53 E
*To: *The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
*Subject: *Re: [Pacemaker] monitor on-fail=ignore not restarting when resource reported as stopped

> On 2013-12-06T11:21:02, Patrick Hemmer <pacema...@feystorm.net> wrote:
>
>>> So where is the problem? If the script returns "ERROR" then pacemaker has
>>> to act accordingly.
>> If the script returns "ERROR" the `on-fail=ignore` should make it do
>> nothing. Amazon's API failed, we need to just retry again later.
>> If the script returns "STOPPED", this isn't an error. The script queried
>> the resource, found it was stopped, and reported it as stopped.
>> Pacemaker should act accordingly and start it back up.
> For a resource that pacemaker expects to be started, it's an error if it
> is found to be stopped. Pacemaker can't tell if it is really cleanly
> stopped, or died, or ...

Oh, and I'll quote the OCF spec on this one:

    1   generic or unspecified error (current practice)
        The "monitor" operation shall return this for a crashed, hung or
        otherwise non-functional resource.

    7   program is not running
        Note: This is not the error code to be returned by a successful
        "stop" operation. A successful "stop" operation shall return 0.
        The "monitor" action shall return this value only for a
        _cleanly_ stopped resource. If in doubt, it should return 1.

So the OCF spec very clearly states that OCF_ERR_GENERIC means the resource
has failed, and OCF_NOT_RUNNING means it shut down cleanly. So yes, pacemaker
can tell if it stopped cleanly.

>
> If you want Pacemaker to recover failed resources, do not set
> on-fail="ignore". I still don't quite get why you set that when you
> obviously don't want the associated behaviour?
>
>
> Regards,
> Lars
>
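
To make the distinction between the two monitor return codes from the OCF spec concrete, here is a minimal sketch. It is only an illustration in Python (real resource agents are usually shell scripts). The function name and the state strings are hypothetical; the numeric values 1 and 7 are the ones quoted from the spec, and 0 is the spec's success code.

```python
# OCF exit codes (1 and 7 as quoted from the spec; 0 is success).
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


def monitor_exit_code(state):
    """Map an observed resource state to an OCF monitor exit code.

    The state strings are hypothetical placeholders for whatever the
    agent's own probe reports; they are not part of the OCF spec.
    """
    if state == "running":
        return OCF_SUCCESS
    if state == "cleanly_stopped":
        # Only a *cleanly* stopped resource may report "not running".
        return OCF_NOT_RUNNING
    # Crashed, hung, or anything the agent cannot classify:
    # "If in doubt, it should return 1."
    return OCF_ERR_GENERIC


if __name__ == "__main__":
    for s in ("running", "cleanly_stopped", "crashed", "unknown"):
        print(s, monitor_exit_code(s))
```

The fall-through to OCF_ERR_GENERIC is deliberate: it mirrors the spec's "if in doubt" rule, so the agent never claims a clean stop that it has not positively confirmed.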
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org