Re: [Pacemaker] Help with N+1 configuration

Phil Frost Fri, 27 Jul 2012 08:59:22 -0700

On 07/27/2012 11:48 AM, Cal Heldenbrand wrote:

Why wouldn't my mem3 failover happen if it timed out stopping the cluster IP?

If a stop action fails, pacemaker can't know if the resource is running, not running, or in some other broken state. The cluster is in an unknown state, and there's no reasonable thing pacemaker can do. Since pacemaker thinks a node is broken (it failed to stop a resource, as requested) but isn't sure, the solution is to transition to a known state by powering the node off, resetting it, or otherwise fencing it. Configure a STONITH resource to do this. Without STONITH, your only option is to manually address the cause of the failure (high load, in this case), then issue "crm resource cleanup ..." on any failed resources to instruct pacemaker that it is safe to try again.
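
As a rough illustration of both options, here is a minimal sketch in crm shell terms. Everything specific in it is an assumption, not something taken from this thread: the node name "node1", the resource name "my_resource", the IPMI address and credentials, and the choice of the external/ipmi STONITH plugin (your hardware may need a different fencing agent with different parameters). The script only prints the commands unless DRY_RUN=0 is set, so it is safe to read through and adapt first.

```shell
#!/bin/sh
# Sketch only. All names and values below are hypothetical placeholders.
# By default the commands are printed, not executed (DRY_RUN=1).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical values; replace with the details of your own cluster.
NODE="node1"
IPMI_ADDR="192.0.2.10"
IPMI_USER="admin"
IPMI_PASS="secret"

# Option 1: give pacemaker a way to fence the node.
# Assumes the external/ipmi STONITH plugin is installed and suits the hardware.
configure_fencing() {
    run crm configure primitive "st-$NODE" stonith:external/ipmi \
        params hostname="$NODE" ipaddr="$IPMI_ADDR" \
        userid="$IPMI_USER" passwd="$IPMI_PASS" interface=lan \
        op monitor interval=60s
    # Keep the fencing resource off the node it is meant to fence.
    run crm configure location "l-st-$NODE" "st-$NODE" -inf: "$NODE"
    run crm configure property stonith-enabled=true
}

# Option 2 (no STONITH): after fixing the underlying cause by hand,
# tell pacemaker it is safe to retry the failed resource.
cleanup_resource() {
    run crm resource cleanup "$1"
}
```

For example, after sourcing the script, "configure_fencing" prints the three crm configure commands for the placeholder node, and "cleanup_resource my_resource" prints the cleanup command for a placeholder resource. Check the result with "crm configure show" before relying on it.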



_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
