Alex Balashov wrote:
> Greetings,
> I am using a custom OCF RA and Heartbeat v2 + CRM/CIB for monitoring a
> custom service at the application level in an active-passive binary
> cluster.
> When the service is detected as failing on the first node, the resource
> manager tries to restart the service. I've set effective service and
> failure stickiness to almost zero so if it fails to start, it will fail
> over all the resources to the secondary node.
> What I want to know is whether it's possible to fail the service over
> immediately the moment a single monitor procedure fails, no questions
> asked, without any attempts to restart. If so, what cluster property
> sets should I set and how?
Set default-resource-failure-stickiness to -infinity.
cibadmin -U -o crm_config -X '<cluster_property_set
id="cib-bootstrap-options"><nvpair id="someid"
name="default-resource-failure-stickiness"
value="-infinity"/></cluster_property_set>'
should do.
With that setting, any failed monitor operation makes the resource
unrunnable on the node where it failed, and the cluster will choose
another node and start the resource there.
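To check that the property actually landed in the CIB, something along
these lines may help. It is only a sketch: nvpair_value is a made-up
helper name, it relies on "grep -o" (GNU and BSD grep both have it), and
it does plain text matching rather than real XML parsing.

```shell
#!/bin/sh
# Hypothetical helper: read crm_config XML on stdin and print the value
# attribute of the nvpair whose name attribute matches $1.
nvpair_value() {
    name=$1
    # Join lines first so an nvpair wrapped across lines still matches.
    tr '\n' ' ' |
        grep -o "<nvpair[^>]*name=\"$name\"[^>]*>" |
        sed -n 's/.*value="\([^"]*\)".*/\1/p'
}

# Usage against a live cluster (not executed here):
#   cibadmin -Q -o crm_config | nvpair_value default-resource-failure-stickiness
```

If the update went through, the usage line should print the value you
set; empty output means the nvpair was not found.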
In order to ever be able to run that resource on this node again, you
have to reset the particular failcount.
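A minimal sketch of that reset, using the crm_failcount tool that comes
with the Heartbeat v2 CRM. The function name reset_failcount, the
resource id "my-service" and the node name "node1" are placeholders of
mine, not anything from your setup. By default the function only prints
the command; set DRY_RUN=0 to really run it, and check the flags against
crm_failcount's help output on your version first.

```shell
#!/bin/sh
# Hypothetical helper: build (and optionally run) the crm_failcount call
# that clears a resource's failcount on one node.
# Usage: reset_failcount <resource-id> <node-uname>
reset_failcount() {
    rsc=$1
    node=$2
    if [ -z "$rsc" ] || [ -z "$node" ]; then
        echo "usage: reset_failcount <resource-id> <node-uname>" >&2
        return 1
    fi
    # -D deletes the failcount attribute, -U names the node,
    # -r names the resource.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "crm_failcount -D -U $node -r $rsc"
    else
        crm_failcount -D -U "$node" -r "$rsc"
    fi
}

# Example (dry run, prints the command only):
#   reset_failcount my-service node1
```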
If you used Pacemaker 1.0, you would not have to deal with
failure-stickiness anymore and could use the very nice new
"migration-threshold" feature instead. Set it to 1 and the resource
will fail over after a single failure, regardless of its score.
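For illustration, one way to set it on a single resource in Pacemaker 1.0
is as a meta attribute via crm_resource. Again just a sketch under
assumptions: set_migration_threshold is a hypothetical wrapper,
"my-service" is a placeholder resource id, and the function only prints
the command unless DRY_RUN=0.

```shell
#!/bin/sh
# Hypothetical helper: build (and optionally run) the crm_resource call
# that sets migration-threshold as a meta attribute of one resource.
# Usage: set_migration_threshold <resource-id> [threshold]  (default 1)
set_migration_threshold() {
    rsc=$1
    threshold=${2:-1}
    if [ -z "$rsc" ]; then
        echo "usage: set_migration_threshold <resource-id> [threshold]" >&2
        return 1
    fi
    # Reject anything that is not a plain non-negative integer.
    case $threshold in
        ''|*[!0-9]*)
            echo "threshold must be a non-negative integer" >&2
            return 1
            ;;
    esac
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "crm_resource --resource $rsc --meta --set-parameter migration-threshold --parameter-value $threshold"
    else
        crm_resource --resource "$rsc" --meta \
            --set-parameter migration-threshold \
            --parameter-value "$threshold"
    fi
}

# Example (dry run, prints the command only):
#   set_migration_threshold my-service 1
```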
Regards
Dominik
_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems