To be more specific:

I've tried following the example on pages 25-26 of this document to the letter:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf

And it does work as advertised: when I stop corosync, the resource moves to 
the other node, and when I start corosync again it stays there, as it should.
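
For reference, the test is just something along these lines (the init script 
path and the use of crm_mon here are only a sketch and may differ by distro):

        # on the node currently running the resource
        /etc/init.d/corosync stop

        # on the other node, confirm the resource has moved
        crm_mon -1

        # back on the first node
        /etc/init.d/corosync start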

However, if I simply unplug the Ethernet cable, let the resource migrate, and 
then plug it back in, the resource fails back to the original node. Is this 
the intended behavior? It seems a bad NIC could wreak havoc on such a setup.
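
If it is not, I assume the usual answer is resource stickiness, something 
along the lines of (the value is arbitrary):

        crm configure rsc_defaults resource-stickiness="1000"

but as the quoted message below shows, stickiness alone has not helped here.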

Thanks!

Daniel

On May 16, 2011, at 5:33 PM, Daniel Bozeman wrote:

> For the life of me, I cannot prevent auto-failback from occurring in a 
> master-slave setup I have in virtual machines. I have a very simple 
> configuration:
> 
> node $id="4fe75075-333c-4614-8a8a-87149c7c9fbb" ha2 \
>        attributes standby="off"
> node $id="70718968-41b5-4aee-ace1-431b5b65fd52" ha1 \
>        attributes standby="off"
> primitive FAILOVER-IP ocf:heartbeat:IPaddr \
>        params ip="192.168.1.79" \
>        op monitor interval="10s"
> primitive PGPOOL lsb:pgpool2 \
>        op monitor interval="10s"
> group PGPOOL-AND-IP FAILOVER-IP PGPOOL
> colocation IP-WITH-PGPOOL inf: FAILOVER-IP PGPOOL
> property $id="cib-bootstrap-options" \
>        dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
>        cluster-infrastructure="Heartbeat" \
>        stonith-enabled="false" \
>        no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
>        resource-stickiness="1000"
> 
> No matter what I do with resource stickiness, I cannot prevent fail-back. 
> Failback is usually not a problem when I restart the current master. But 
> when I disable network connectivity to the master, everything fails over 
> fine, and as soon as I re-enable the network adapter everything jumps back 
> to the original "failed" node. I've been watching the scores with 
> "watch ptest -Ls", and they seem to indicate that failback should not occur.
> 
> I'm also seeing resources bounce more times than necessary when a node is 
> added (roughly three times each), and resources seem to bounce when a node 
> returns to the cluster even when there is no need for them to move. I also 
> had an order directive in my configuration at one time, and often the 
> second resource would start, then stop, then let the first resource start, 
> and then start itself. Quite weird.
> 
> Any nudges in the right direction would be greatly appreciated. I've 
> scoured Google and read the official documentation to no avail. I should 
> also mention that I am using Heartbeat. My LSB resource implements 
> start/stop/status properly and without error.
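> 
> For reference, this is roughly how I have been watching the scores (I 
> believe newer builds expose the same output through crm_simulate, but I 
> have not verified that):
> 
>        watch -n1 'ptest -Ls'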
> 
> I've been testing this with a floating IP + Postgres as well, and I see the 
> same issues. One thing I notice is that my "group" resources have no score. 
> Is that normal? There doesn't seem to be any way to assign a stickiness to 
> a group, and default stickiness has no effect.
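> 
> The only candidate I can see in the crm syntax would be a meta attribute on 
> the group itself, along these lines, though I don't know whether it is 
> actually honored for groups:
> 
>        group PGPOOL-AND-IP FAILOVER-IP PGPOOL \
>                meta resource-stickiness="1000"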
> 
> Thanks!
> 
> Daniel Bozeman

Daniel Bozeman
American Roamer
Systems Administrator
daniel.boze...@americanroamer.com

