On 9/11/2018 9:20 AM, Dan Ragle wrote:


On 9/11/2018 1:59 AM, Andrei Borzenkov wrote:
07.09.2018 23:07, Dan Ragle wrote:
On an active-active two-node cluster with DRBD, dlm, filesystem mounts,
a Web Server, and some crons, I can't figure out how to have the crons
jump from node to node in the correct order. Specifically, I have two
crontabs (managed via symlink creation/deletion) which normally run one
on node1 and the other on node2. When a node goes down, I want both to
run on the remaining node until the original node comes back up, at
which time they should split across the nodes again. However, when
returning to the original node, the crontab that is being moved must
wait until the underlying FS mount is done on that node before jumping.

DRBD, dlm, the filesystem mounts, and the Web Server are all working as
expected; when I mark the second node as standby, Apache stops, the FS
unmounts, dlm stops, and DRBD stops on that node; and when I mark the
same node unstandby, the reverse happens as expected. All of those are
cloned resources (DRBD via a master/slave clone).
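
(For reference, the standby testing described above is just toggling
standby on one node, roughly like the following; node names match the
config further down.)

# put node2 in standby: Apache, the FS mount, dlm, and DRBD all stop there
pcs cluster standby node2.mydomain.com

# bring it back: the same resources restart in the expected order
pcs cluster unstandby node2.mydomain.com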

The two per-node crontab resources are not cloned and create symlinks,
one resource preferring the first node and the other preferring the
second. Each is colocated with and order-dependent on the filesystem
mounts (which in turn are colocated with and dependent on dlm, which in
turn is colocated with and dependent on DRBD promotion). I thought this
would be sufficient, but when the original node is marked unstandby,
the crontab that prefers that node attempts to jump over immediately,
before the FS is mounted there. Of course the symlink creation fails
because the underlying filesystem hasn't been mounted yet.
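
(Not shown above, but the failed symlink start and the constraint chain
can be checked with something like the commands below.)

# one-shot cluster status, including inactive resources and failed actions
crm_mon -1r

# list all constraints with their IDs to verify the order/colocation chain
pcs constraint show --full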

pcs version is 0.9.162.

Here's the (obfuscated) full list of commands for the config. I'm still
setting it up, so it's not production-ready yet, but I want to get this
much sorted before I add too much more.

# pcs config export pcs-commands
#!/usr/bin/sh
# sequence generated on 2018-09-07 15:21:15 with: clufter 0.77.0
# invoked as: ['/usr/sbin/pcs', 'config', 'export', 'pcs-commands']
# targeting system: ('linux', 'centos', '7.5.1804', 'Core')
# using interpreter: CPython 2.7.5
pcs cluster auth node1.mydomain.com node2.mydomain.com <> /dev/tty
pcs cluster setup --name MyCluster \
   node1.mydomain.com node2.mydomain.com --transport udpu
pcs cluster start --all --wait=60
pcs cluster cib tmp-cib.xml
cp tmp-cib.xml tmp-cib.xml.deltasrc
pcs -f tmp-cib.xml property set stonith-enabled=false
pcs -f tmp-cib.xml property set no-quorum-policy=freeze
pcs -f tmp-cib.xml resource defaults resource-stickiness=100
pcs -f tmp-cib.xml resource create DRBD ocf:linbit:drbd drbd_resource=r0 \
   op demote interval=0s timeout=90 monitor interval=60s \
   notify interval=0s timeout=90 promote interval=0s timeout=90 \
   reload interval=0s timeout=30 \
   start interval=0s timeout=240 stop interval=0s timeout=100
pcs -f tmp-cib.xml resource create dlm ocf:pacemaker:controld \
   allow_stonith_disabled=1 \
   op monitor interval=60s start interval=0s timeout=90 stop interval=0s \
   timeout=100
pcs -f tmp-cib.xml resource create WWWMount ocf:heartbeat:Filesystem \
   device=/dev/drbd1 directory=/var/www fstype=gfs2 \
   options=_netdev,nodiratime,noatime \
   op monitor interval=20 timeout=40 notify interval=0s timeout=60 start \
   interval=0s timeout=120s stop interval=0s timeout=120s
pcs -f tmp-cib.xml resource create WebServer ocf:heartbeat:apache \
   configfile=/etc/httpd/conf/httpd.conf \
   statusurl=http://localhost/server-status \
   op monitor interval=1min start interval=0s timeout=40s stop interval=0s \
   timeout=60s
pcs -f tmp-cib.xml resource create SharedRootCrons ocf:heartbeat:symlink \
   link=/etc/cron.d/root-shared target=/var/www/crons/root-shared \
   op monitor interval=60 timeout=15 start interval=0s timeout=15 stop \
   interval=0s timeout=15
pcs -f tmp-cib.xml resource create SharedUserCrons ocf:heartbeat:symlink \
   link=/etc/cron.d/User-shared target=/var/www/crons/User-shared \
   op monitor interval=60 timeout=15 start interval=0s timeout=15 stop \
   interval=0s timeout=15
pcs -f tmp-cib.xml resource create PrimaryUserCrons ocf:heartbeat:symlink \
   link=/etc/cron.d/User-server1 target=/var/www/crons/User-server1 \
   op monitor interval=60 timeout=15 start interval=0s timeout=15 stop \
   interval=0s timeout=15 meta resource-stickiness=0
pcs -f tmp-cib.xml \
   resource create SecondaryUserCrons ocf:heartbeat:symlink \
   link=/etc/cron.d/User-server2 target=/var/www/crons/User-server2 \
   op monitor interval=60 timeout=15 start interval=0s timeout=15 stop \
   interval=0s timeout=15 meta resource-stickiness=0
pcs -f tmp-cib.xml \
   resource clone dlm clone-max=2 clone-node-max=1 interleave=true
pcs -f tmp-cib.xml resource clone WWWMount interleave=true
pcs -f tmp-cib.xml resource clone WebServer interleave=true
pcs -f tmp-cib.xml resource clone SharedRootCrons interleave=true
pcs -f tmp-cib.xml resource clone SharedUserCrons interleave=true
pcs -f tmp-cib.xml \
   resource master DRBDClone DRBD master-node-max=1 clone-max=2 \
   master-max=2 \
   interleave=true notify=true clone-node-max=1
pcs -f tmp-cib.xml \
   constraint colocation add dlm-clone with DRBDClone \
   id=colocation-dlm-clone-DRBDClone-INFINITY
pcs -f tmp-cib.xml constraint order promote DRBDClone \
   then dlm-clone id=order-DRBDClone-dlm-clone-mandatory
pcs -f tmp-cib.xml \
   constraint colocation add WWWMount-clone with dlm-clone \
   id=colocation-WWWMount-clone-dlm-clone-INFINITY
pcs -f tmp-cib.xml constraint order dlm-clone \
   then WWWMount-clone id=order-dlm-clone-WWWMount-clone-mandatory
pcs -f tmp-cib.xml \
   constraint colocation add WebServer-clone with WWWMount-clone \
   id=colocation-WebServer-clone-WWWMount-clone-INFINITY
pcs -f tmp-cib.xml constraint order WWWMount-clone \
   then WebServer-clone id=order-WWWMount-clone-WebServer-clone-mandatory
pcs -f tmp-cib.xml \
   constraint colocation add SharedRootCrons-clone with WWWMount-clone \
   id=colocation-SharedRootCrons-clone-WWWMount-clone-INFINITY
pcs -f tmp-cib.xml \
   constraint colocation add SharedUserCrons-clone with WWWMount-clone \
   id=colocation-SharedUserCrons-clone-WWWMount-clone-INFINITY
pcs -f tmp-cib.xml constraint order WWWMount-clone \
   then SharedRootCrons-clone \
   id=order-WWWMount-clone-SharedRootCrons-clone-mandatory
pcs -f tmp-cib.xml constraint order WWWMount-clone \
   then SharedUserCrons-clone \
   id=order-WWWMount-clone-SharedUserCrons-clone-mandatory
pcs -f tmp-cib.xml \
   constraint location PrimaryUserCrons prefers node1.mydomain.com=500
pcs -f tmp-cib.xml \
   constraint colocation add PrimaryUserCrons with WWWMount-clone \
   id=colocation-PrimaryUserCrons-WWWMount-clone-INFINITY
pcs -f tmp-cib.xml constraint order WWWMount-clone \
   then PrimaryUserCrons \
   id=order-WWWMount-clone-PrimaryUserCrons-mandatory
pcs -f tmp-cib.xml \
   constraint location SecondaryUserCrons prefers node2.mydomain.com=500

I can't answer your question, but just an observation: it appears that
only the resources with explicit location preferences misbehave. Is it
possible, as a workaround, to not use them?

I suppose it's not *critical* that PrimaryUserCrons run on node1 and
SecondaryUserCrons on node2, so long as they remain split across the
nodes during normal operation. I could try something like a negative
colocation constraint to keep them separate, if nothing else to see
whether that lets them bounce back and forth cleanly with regard to
their other constraints. I'll give that a shot this morning.


Removed the two location constraints, and instead did:

pcs constraint colocation add PrimaryUserCrons with SecondaryUserCrons -500

Same result.
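
(For anyone retrying this: the failed symlink start can be cleared
between attempts with something like the below; resource name as in the
config above.)

# clear the failed start so Pacemaker will retry placing the resource
pcs resource cleanup PrimaryUserCrons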

_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
