On Mon, Oct 11, 2010 at 9:40 AM, Dan Frincu <[email protected]> wrote:
> Hi all,
>
> I've managed to make this setup work. Basically, the issue is that with
> symmetric-cluster="false" and the resources' locations specified manually,
> the resources always obey the location constraints and (as far as I could
> see) disregard the rsc_defaults resource-stickiness values.
This definitely should not be the case. Possibly your stickiness setting is
being eclipsed by the combination of the location constraint scores. Try
INFINITY instead.

> This behavior is not the expected one. In theory, setting
> symmetric-cluster="false" should only affect whether resources are allowed
> to run anywhere by default, and resource-stickiness should lock the
> resources in place so they don't bounce from node to node. Again, this
> didn't happen, but with symmetric-cluster="true", the same ordering and
> colocation constraints, and the same resource-stickiness, the behavior is
> the expected one.
>
> I don't remember the docs on clusterlabs.org mentioning anywhere that
> resource-stickiness only works with symmetric-cluster="true", so I hope
> this helps anyone else who stumbles upon this issue.
>
> Regards,
>
> Dan
>
> Dan Frincu wrote:
>>
>> Hi,
>>
>> Since it was brought to my attention that I should upgrade from
>> openais-0.80 to a more recent version of corosync, I've done just that;
>> however, I'm now seeing strange behavior on the cluster.
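As a sketch of the suggestion above (the group and constraint names here are illustrative, not taken from the attached hb_report): in an asymmetric cluster every resource starts at -INFINITY on every node, so placement comes entirely from the explicit location scores, and a finite stickiness can be outweighed by a higher score for the preferred node — which looks exactly like stickiness being ignored. With INFINITY stickiness the running node can no longer be eclipsed:

```shell
# Opt-in cluster: resources may run only where a location constraint allows.
crm configure property symmetric-cluster="false"

# INFINITY stickiness: once placed, a resource's current node always wins
# over a merely-finite preference for another node.
crm configure rsc_defaults resource-stickiness="INFINITY"

# Allow the group on both nodes, preferring bench1. With stickiness=100
# instead of INFINITY, the 200-vs-100 gap below would pull the group back
# to bench1 on recovery; with INFINITY it stays put.
crm configure location loc-grp-bench1 my_group 200: bench1.streamwide.ro
crm configure location loc-grp-bench2 my_group 100: bench2.streamwide.ro
```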
>>
>> The same setup was used with the packages below:
>>
>> # rpm -qa | grep -Ei "(openais|cluster|heartbeat|pacemaker|resource)"
>> openais-0.80.5-15.2
>> cluster-glue-1.0-12.2
>> pacemaker-1.0.5-4.2
>> cluster-glue-libs-1.0-12.2
>> resource-agents-1.0-31.5
>> pacemaker-libs-1.0.5-4.2
>> pacemaker-mgmt-1.99.2-7.2
>> libopenais2-0.80.5-15.2
>> heartbeat-3.0.0-33.3
>> pacemaker-mgmt-client-1.99.2-7.2
>>
>> Now I've migrated to the most recent stable packages I could find (on
>> the clusterlabs.org website) for RHEL5:
>>
>> # rpm -qa | grep -Ei "(openais|cluster|heartbeat|pacemaker|resource)"
>> cluster-glue-1.0.6-1.6.el5
>> pacemaker-libs-1.0.9.1-1.el5
>> pacemaker-1.0.9.1-1.el5
>> heartbeat-libs-3.0.3-2.el5
>> heartbeat-3.0.3-2.el5
>> openaislib-1.1.3-1.6.el5
>> resource-agents-1.0.3-2.el5
>> cluster-glue-libs-1.0.6-1.6.el5
>> openais-1.1.3-1.6.el5
>>
>> Expected behavior:
>> - all the resources in the group should go (based on location
>>   preference) to bench1
>> - if bench1 goes down, resources migrate to bench2
>> - if bench1 comes back up, resources stay on bench2, unless manually
>>   told otherwise
>>
>> With the previous packages this worked; with the new packages, not so
>> much. Now if bench1 goes down (crm node standby `uname -n`), failover
>> occurs, but when bench1 comes back up, the resources migrate back even
>> though default-resource-stickiness is set. Worse, two DRBD block devices
>> hit INFINITY fail counts, most notably because the cluster tries to
>> promote the resources to Master on bench1 but fails, the resource being
>> held open (by some process I could not identify).
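To see why the resources fail back, it can help to dump the allocation scores the policy engine actually computes and compare the preferred node's constraint score against the running node's score plus stickiness. A sketch for the Pacemaker 1.0 series described above (newer releases ship the equivalent `crm_simulate` tool; the sample score lines are illustrative, not from this thread):

```shell
# Show allocation scores from the live CIB (-L = live cluster, -s = scores).
ptest -sL

# Expect lines of roughly this shape; if bench1's score exceeds
# bench2's score + stickiness for a resource, it fails back when
# bench1 returns from standby:
#   native_color: drbd_mysql:0 allocation score on bench1.streamwide.ro: 200
#   native_color: drbd_mysql:0 allocation score on bench2.streamwide.ro: 100
```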
>>
>> Strangely enough, the resources (drbd) fail to be promoted to Master on
>> bench1, so they fail back to bench2, where they are mounted and
>> functional, but crm_mon shows:
>>
>> Migration summary:
>> * Node bench2.streamwide.ro:
>>    drbd_mysql:1: migration-threshold=1000000 fail-count=1000000
>>    drbd_home:1: migration-threshold=1000000 fail-count=1000000
>> * Node bench1.streamwide.ro:
>>
>> ... i.e. INFINITY fail counts on bench2, while the drbd resources are
>> available:
>>
>> version: 8.3.2 (api:88/proto:86-90)
>> GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by
>> [email protected], 2009-08-29 14:07:55
>>  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
>>     ns:1632 nr:1864 dw:3512 dr:3933 al:11 bm:19 lo:0 pe:0 ua:0 ap:0
>>     ep:1 wo:b oos:0
>>  1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
>>     ns:4 nr:24 dw:28 dr:25 al:1 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>>  2: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
>>     ns:4 nr:24 dw:28 dr:85 al:1 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>>
>> and mounted:
>>
>> /dev/drbd1 on /home type ext3 (rw,noatime,nodiratime)
>> /dev/drbd0 on /mysql type ext3 (rw,noatime,nodiratime)
>> /dev/drbd2 on /storage type ext3 (rw,noatime,nodiratime)
>>
>> Attached is the hb_report.
>>
>> Thank you in advance.
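For the two practical problems above — identifying the process holding the DRBD device open, and recovering from the fail-count=1000000 (i.e. INFINITY) entries — a sketch along these lines should help (run the first part on bench1; resource names are taken from the crm_mon output quoted above, but verify against your own configuration):

```shell
# Name the process keeping the DRBD backing device busy, which is what
# blocks promotion to Master on bench1; either tool should identify it:
fuser -vm /dev/drbd0
lsof /dev/drbd0

# Once the cause is resolved, clear the failures so Pacemaker will
# consider the node again; cleanup also resets the fail counts:
crm resource cleanup drbd_mysql
crm resource cleanup drbd_home
# or, per resource and node, via the crm shell:
crm resource failcount drbd_mysql delete bench1.streamwide.ro
```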
>>
>> Best regards
>>
>
> --
> Dan FRINCU
> Systems Engineer
> CCNA, RHCE
> Streamwide Romania
>
>
> _______________________________________________
> Pacemaker mailing list: [email protected]
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
