Re: [Pacemaker] Preventing auto-fail-back

Max Williams Wed, 18 May 2011 01:41:40 -0700

Hi Daniel,
You might want to set "on-fail=standby" for the resource group or individual
resources. This will put the host into standby when a failure occurs, thus
preventing failback:
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-resource-operations.html#s-resource-failure
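
As a rough, untested sketch in crm shell syntax, reusing the FAILOVER-IP
primitive from your configuration below: on-fail is an attribute of an
operation, so one place to put it is the monitor op.

```
primitive FAILOVER-IP ocf:heartbeat:IPaddr \
       params ip="192.168.1.79" \
       op monitor interval="10s" on-fail="standby"
```

PGPOOL's monitor op would presumably need the same attribute if a pgpool
failure should also push the node into standby. A node that has gone into
standby stays there until you bring it back yourself, e.g. with
"crm node online ha1".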


Another option is to set resource stickiness, which will stop resources from
moving back after a failure:
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch05s03s02.html
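
For illustration only (a sketch, not tested against your cluster): stickiness
can be set as a cluster-wide default, or as a meta attribute on the group
itself. The value 1000 is simply the one from your posted configuration.

```
rsc_defaults resource-stickiness="1000"
group PGPOOL-AND-IP FAILOVER-IP PGPOOL \
       meta resource-stickiness="1000"
```

Either form should be enough on its own; the meta attribute on the group is
the one to try if the default does not seem to take effect.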

Also note that if you are using a two-node cluster, you will need the property
"no-quorum-policy=ignore" set.
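
In crm shell syntax that is a single property line, shown here as a sketch
(the configuration you posted appears to have it already):

```
property no-quorum-policy="ignore"
```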

Hope that helps!
Cheers,
Max

From: Daniel Bozeman [mailto:daniel.boze...@americanroamer.com]
Sent: 17 May 2011 19:09
To: pacemaker@oss.clusterlabs.org
Subject: Re: [Pacemaker] Preventing auto-fail-back

To be more specific:

I've tried following the example on page 25/26 of this document to the letter: 
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf

And it does work as advertised. When I stop corosync, the resource goes to the 
other node. I start corosync and it remains there as it should.

However, if I simply unplug the ethernet connection, let the resource migrate,
then plug it back in, it will fail back to the original node. Is this the
intended behavior? It seems a bad NIC could wreak havoc on such a setup.

Thanks!

Daniel

On May 16, 2011, at 5:33 PM, Daniel Bozeman wrote:


For the life of me, I cannot prevent auto-failback from occurring in a 
master-slave setup I have in virtual machines. I have a very simple 
configuration:

node $id="4fe75075-333c-4614-8a8a-87149c7c9fbb" ha2 \
       attributes standby="off"
node $id="70718968-41b5-4aee-ace1-431b5b65fd52" ha1 \
       attributes standby="off"
primitive FAILOVER-IP ocf:heartbeat:IPaddr \
       params ip="192.168.1.79" \
       op monitor interval="10s"
primitive PGPOOL lsb:pgpool2 \
       op monitor interval="10s"
group PGPOOL-AND-IP FAILOVER-IP PGPOOL
colocation IP-WITH-PGPOOL inf: FAILOVER-IP PGPOOL
property $id="cib-bootstrap-options" \
       dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
       cluster-infrastructure="Heartbeat" \
       stonith-enabled="false" \
       no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
       resource-stickiness="1000"

No matter what I do with resource stickiness, I cannot prevent fail-back.
Failback usually isn't a problem when I restart the current master. But when I
disable network connectivity to the master, everything fails over fine, and
then when I re-enable the network adapter everything jumps back to the
original "failed" node. I've done some "watch ptest -Ls"ing, and the scores
seem to indicate that failback should not occur. I'm also seeing resources
bounce more times than necessary when a node is added (~3 times each) and 
resources seem to bounce when a node returns to the cluster even if it isn't 
necessary for them to do so. I also had an order directive in my configuration 
at one time, and often the second resource would start, then stop, then allow 
the first resource to start, then start itself. Quite weird. Any nods in the 
right direction would be greatly appreciated. I've scoured Google and read the 
official documentation to no avail. I suppose I should mention I am using 
heartbeat as well. My LSB resource implements start/stop/status properly 
without error.

I've been testing this with a floating IP + Postgres as well with the same 
issues. One thing I notice is that my "group" resources have no score. Is this 
normal? There doesn't seem to be any way to assign a stickiness to a group, and 
default stickiness has no effect.

Thanks!

Daniel Bozeman
American Roamer
Systems Administrator
daniel.boze...@americanroamer.com



