Alex Balashov wrote:
Greetings,

I am using a custom OCF resource agent with Heartbeat v2 (CRM/CIB) to monitor a custom service at the application level in a two-node active-passive cluster.

When the service is detected as failing on the first node, the resource manager tries to restart it. I've set the effective resource stickiness and failure stickiness to almost zero, so that if the restart fails, all the resources fail over to the secondary node.

What I want to know is whether it's possible to fail the service over the moment a single monitor operation fails, no questions asked, without any attempt to restart it. If so, which cluster properties should I set, and how?

Set default-resource-failure-stickiness to -infinity.

cibadmin -U -o crm_config -X '<cluster_property_set id="cib-bootstrap-options"><nvpair id="someid" name="default-resource-failure-stickiness" value="-infinity"/></cluster_property_set>'

should do.

With this setting, any failed monitor operation makes the resource ineligible to run on the node where it failed, so the cluster chooses another node and starts the resource there.

To let the resource run on that node again, you have to reset its failcount for that node.
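
A rough sketch of that reset, assuming the crm_failcount tool that ships with the CRM is installed. The node name "node1" and resource id "my_service" are placeholders, not taken from the setup above, and the DRY_RUN switch is only a convenience of this sketch so the commands can be inspected before they touch the cluster.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 (helper for this sketch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show the current failcount of a resource on a node.
# $1 = node uname, $2 = resource id (both placeholders here).
show_failcount() {
    run crm_failcount -G -U "$1" -r "$2"
}

# Delete the failcount so the resource may run on that node again.
reset_failcount() {
    run crm_failcount -D -U "$1" -r "$2"
}
```

Calling "DRY_RUN=1 reset_failcount node1 my_service" only prints the crm_failcount command line. Without DRY_RUN it is executed against the running cluster.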

If you used Pacemaker 1.0, you would no longer have to deal with failure-stickiness and could use the new "migration-threshold" option instead. Set it to 1 and the resource will fail over after a single failure, regardless of its score.
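
For illustration, a minimal sketch that prints the meta attribute fragment for such a setting. The resource id "my_service" and the generated XML ids are made up for this example. Only the attribute name "migration-threshold" and the value 1 come from the text above.

```shell
#!/bin/sh
# Print a meta_attributes fragment that sets migration-threshold.
# $1 = resource id (placeholder), $2 = failures allowed before moving away.
# The id values are invented for this sketch; any unique ids would do.
migration_threshold_xml() {
    rsc="$1"
    threshold="$2"
    printf '<meta_attributes id="%s-meta">' "$rsc"
    printf '<nvpair id="%s-migration-threshold" name="migration-threshold" value="%s"/>' "$rsc" "$threshold"
    printf '</meta_attributes>\n'
}

migration_threshold_xml my_service 1
```

In a Pacemaker 1.0 CIB, a fragment like this would sit inside the resource's own primitive element, not in crm_config.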

Regards
Dominik
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
