I've been trying to debug and do a root cause analysis for a cascading
series of failures that a customer hit a couple of days ago, that
caused their filesystem to be unavailable for a couple of hours.
The original failure was in our own distributed filesystem backend, a
fork of LizardFS, which
I have a master/slave resource (with a custom resource agent) which,
if it uncleanly shut down, will return OCF_FAILED_MASTER on the next
monitor operation. This seems to be what
http://www.linux-ha.org/doc/dev-guides/_literal_ocf_failed_master_literal_9.html
suggests that exit code should be used