Hi,

On 02/23/2011 06:19 PM, Jelle de Jong wrote:
Dear Dan,

Thank you for taking the time to read and answer my question.

On 23-02-11 09:42, Dan Frincu wrote:
This is something that you should remove from the config; as I
understand it, all resources should run together on the same node and
migrate together to the other node.

   location cli-prefer-ip_virtual01 ip_virtual01 \
           rule $id="cli-prefer-rule-ip_virtual01" inf: #uname eq finley
   location cli-prefer-iscsi02_lun1 iscsi02_lun1 \
           rule $id="cli-prefer-rule-iscsi02_lun1" inf: #uname eq godfrey
   location cli-prefer-iscsi02_target iscsi02_target \
           rule $id="cli-prefer-rule-iscsi02_target" inf: #uname eq finley
I am sorry, I don’t know what I should do with these 6 rules?

After you put a node in standby, if it is the active node, its resources will migrate to the passive node, making that one active. However, as a side note, you must remember to issue crm node online $nodename afterwards, otherwise the node will not be allowed to run resources again.
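For example, something along these lines, assuming finley is the currently active node:

    crm node standby finley    # resources migrate to godfrey
    crm node online finley     # allow finley to run resources again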
This simplifies the resource design and keeps the cib smaller, while
achieving the same functional goal.
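For example, assuming the constraint ids are exactly as shown above, the three cli-prefer constraints could be dropped with something like:

    crm configure delete cli-prefer-ip_virtual01 cli-prefer-iscsi02_lun1 cli-prefer-iscsi02_target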

Output of ptest -LsVVV and some logs in a pastebin might help.
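One way to capture that output for a pastebin, for example:

    ptest -LsVVV > ptest.txt 2>&1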
I changed my configuration according to your comments, and standby
and reboot of both nodes seem to work fine now! Thank you!

http://debian.pastebin.com/LuUGkRLd  <- configuration and ptest output

However, I still have the problem that I can't seem to move the resources
between nodes with the crm resource move command.
The way I used the crm move command was without specifying the node name. I can't remember now why I did that (probably because I also used it on a 2-node cluster), but the logic was: run crm resource move groupname, and it will create a location constraint preventing the resources in the group from running on the node that is currently primary. After the migration of the resources has occurred, in order to remove the location constraint (e.g. to allow the resources to move back if necessary) you must either remove the location constraint from the cib or use crm resource unmove groupname; I used the unmove command.
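In practice that was just the following, using the rg_iscsi group name from your config:

    crm resource move rg_iscsi      # adds a cli-* location constraint, group moves off the current node
    crm resource unmove rg_iscsi    # removes that constraint again once the move is done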

Just to be clear:

1. resources on finley ==> crm resource move ==> resources move to godfrey ==> crm resource unmove ==> resources remain on godfrey (we've just removed the constraint, but the resource stickiness prevents the ping-pong effect)
2. resources on godfrey ==> crm resource move ==> resources move to finley ==> crm resource unmove ==> resources remain on finley (same as 1, but seen from the other side)
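For reference, the move command records its decision as a cli-* location constraint in the cib; without a node name it looks roughly like this (the id and score are my guess at what the shell generates, yours may differ), and unmove simply deletes it:

    location cli-standby-rg_iscsi rg_iscsi \
            rule $id="cli-standby-rule-rg_iscsi" -inf: #uname eq finley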

Things to be aware of:

1. resources on a node ==> crm resource move ==> you issue crm resource unmove before the resources finish migrating ==> the resources don't finish migrating to the other node and come back to the original node (so don't get finger happy on the keyboard, give the resources time to move).
2. resources on finley ==> crm resource move ==> resources move to godfrey ==> godfrey crashes ==> resources don't migrate to finley (because the crm resource unmove command was not issued, so the location constraint preventing the resources from running on finley is still in place, even though finley is the last node in the cluster) ==> crm resource unmove ==> resources start on finley.
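A quick way to check whether such a leftover constraint is still in the cib:

    crm configure show | grep cli-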

One thing to test would be to first remove any config that looks like this,
location cli-prefer-rg_iscsi rg_iscsi \
        rule $id="cli-prefer-rule-rg_iscsi" inf: #uname eq finley
with a reference either to finley or to godfrey. Then reboot both nodes, let them start and settle on a location, and do a crm configure save initial.config. Issue crm resource move (let the resources migrate), then crm configure save migrated.config, then crm resource unmove, then crm configure save unmigrated.config, and compare the results. This way you'll see how the setup looks and which rules are added and removed during the process.
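As a sketch, the whole test sequence would be something along these lines (again with the rg_iscsi group from your config):

    crm configure save initial.config
    crm resource move rg_iscsi            # wait for the resources to migrate
    crm configure save migrated.config
    crm resource unmove rg_iscsi
    crm configure save unmigrated.config

    diff -u initial.config migrated.config
    diff -u migrated.config unmigrated.config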

If the move command somehow doesn't work, you might want to check whether you've configured resource level fencing for DRBD, see http://www.drbd.org/users-guide/s-pacemaker-fencing.html. The fence-peer handler will add a constraint in some cases (such as when you put a node in standby) preventing the DRBD resource from running. When you bring a node back online and there have been disk changes, DRBD has to sync some data; until the data is synced the constraint stays in place, so issuing a crm resource move while DRBD is syncing won't have the expected outcome (again the reference to being finger happy on the keyboard). After the sync is done, crm-unfence-peer.sh removes the constraint and the move command will work.
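For reference, the resource level fencing setup described in that guide looks roughly like this in the DRBD resource configuration (r0 is a placeholder name, and the handler paths may differ on your distribution):

    resource r0 {
      disk {
        fencing resource-only;
      }
      handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
      }
    }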

Just a couple of things to keep in mind.

HTH,
Dan

Would you be willing to take a look at the pastebin config and ptest
output and maybe tell me how to move the resources?

With kind regards,

Jelle de Jong

--
Dan Frincu
CCNA, RHCE


_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
