On an active-active two-node cluster running DRBD, dlm, filesystem mounts, a web server, and some cron jobs, I can't figure out how to make the crontabs move from node to node in the correct order. Specifically, I have two crontabs (managed by creating and deleting symlinks) that normally run one on node1 and the other on node2. When a node goes down, I want both to run on the remaining node until the original node comes back up, at which point they should split across the nodes again. However, when a crontab returns to its original node, it must wait until the underlying filesystem is mounted there before moving.

DRBD, dlm, the filesystem mounts, and the web server all work as expected: when I put the second node in standby, Apache stops, the filesystem unmounts, dlm stops, and DRBD stops on that node, and when I take the same node out of standby the reverse happens. All four are cloned resources (DRBD as a master/slave clone).

The crontab resources are not cloned. Each creates a symlink, with one resource preferring the first node and the other preferring the second. Each is colocated with, and ordered after, the filesystem mount (which in turn is colocated with and ordered after dlm, which in turn is colocated with and ordered after DRBD promotion). I thought this would be sufficient, but when the original node is taken out of standby, the crontab that prefers that node tries to move back immediately, before the filesystem is mounted there. Of course, the symlink then fails because the underlying filesystem hasn't been mounted yet.
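
The constraint chain described above can be compared with what the cluster has actually loaded. The following is a minimal sketch, not a fix. It assumes the resource names used in this post (SecondaryUserCrons, WWWMount-clone), and the `run` helper is hypothetical: it only prints each command unless DRY_RUN=0 is set, so the script is harmless on a machine without pcs.

```shell
#!/bin/sh
# Hypothetical helper: print the command by default, execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# List the constraints that reference the crontab resource and the mount clone,
# then the full ordering and colocation listings with their constraint ids.
show_cron_constraints() {
    run pcs constraint ref SecondaryUserCrons
    run pcs constraint ref WWWMount-clone
    run pcs constraint order show --full
    run pcs constraint colocation show --full
}

show_cron_constraints
```

Running it with DRY_RUN=0 on a cluster node should print the live constraints, which can then be checked against the exported command list.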

pcs version is 0.9.162.

Here's the obfuscated, detailed list of commands for the config. I'm still setting it up, so it's not production-ready yet, but I want to get this much sorted out before I add much more.

# pcs config export pcs-commands
#!/usr/bin/sh
# sequence generated on 2018-09-07 15:21:15 with: clufter 0.77.0
# invoked as: ['/usr/sbin/pcs', 'config', 'export', 'pcs-commands']
# targeting system: ('linux', 'centos', '7.5.1804', 'Core')
# using interpreter: CPython 2.7.5
pcs cluster auth node1.mydomain.com node2.mydomain.com <> /dev/tty
pcs cluster setup --name MyCluster \
  node1.mydomain.com node2.mydomain.com --transport udpu
pcs cluster start --all --wait=60
pcs cluster cib tmp-cib.xml
cp tmp-cib.xml tmp-cib.xml.deltasrc
pcs -f tmp-cib.xml property set stonith-enabled=false
pcs -f tmp-cib.xml property set no-quorum-policy=freeze
pcs -f tmp-cib.xml resource defaults resource-stickiness=100
pcs -f tmp-cib.xml resource create DRBD ocf:linbit:drbd drbd_resource=r0 \
  op demote interval=0s timeout=90 monitor interval=60s notify interval=0s \
  timeout=90 promote interval=0s timeout=90 reload interval=0s timeout=30 \
  start interval=0s timeout=240 stop interval=0s timeout=100
pcs -f tmp-cib.xml resource create dlm ocf:pacemaker:controld \
  allow_stonith_disabled=1 \
  op monitor interval=60s start interval=0s timeout=90 stop interval=0s \
  timeout=100
pcs -f tmp-cib.xml resource create WWWMount ocf:heartbeat:Filesystem \
  device=/dev/drbd1 directory=/var/www fstype=gfs2 \
  options=_netdev,nodiratime,noatime \
  op monitor interval=20 timeout=40 notify interval=0s timeout=60 start \
  interval=0s timeout=120s stop interval=0s timeout=120s
pcs -f tmp-cib.xml resource create WebServer ocf:heartbeat:apache \
  configfile=/etc/httpd/conf/httpd.conf \
  statusurl=http://localhost/server-status \
  op monitor interval=1min start interval=0s timeout=40s stop interval=0s \
  timeout=60s
pcs -f tmp-cib.xml resource create SharedRootCrons ocf:heartbeat:symlink \
  link=/etc/cron.d/root-shared target=/var/www/crons/root-shared \
  op monitor interval=60 timeout=15 start interval=0s timeout=15 stop \
  interval=0s timeout=15
pcs -f tmp-cib.xml resource create SharedUserCrons ocf:heartbeat:symlink \
  link=/etc/cron.d/User-shared target=/var/www/crons/User-shared \
  op monitor interval=60 timeout=15 start interval=0s timeout=15 stop \
  interval=0s timeout=15
pcs -f tmp-cib.xml resource create PrimaryUserCrons ocf:heartbeat:symlink \
  link=/etc/cron.d/User-server1 target=/var/www/crons/User-server1 \
  op monitor interval=60 timeout=15 start interval=0s timeout=15 stop \
  interval=0s timeout=15 meta resource-stickiness=0
pcs -f tmp-cib.xml \
  resource create SecondaryUserCrons ocf:heartbeat:symlink \
  link=/etc/cron.d/User-server2 target=/var/www/crons/User-server2 \
  op monitor interval=60 timeout=15 start interval=0s timeout=15 stop \
  interval=0s timeout=15 meta resource-stickiness=0
pcs -f tmp-cib.xml \
  resource clone dlm clone-max=2 clone-node-max=1 interleave=true
pcs -f tmp-cib.xml resource clone WWWMount interleave=true
pcs -f tmp-cib.xml resource clone WebServer interleave=true
pcs -f tmp-cib.xml resource clone SharedRootCrons interleave=true
pcs -f tmp-cib.xml resource clone SharedUserCrons interleave=true
pcs -f tmp-cib.xml \
  resource master DRBDClone DRBD master-node-max=1 clone-max=2 master-max=2 \
  interleave=true notify=true clone-node-max=1
pcs -f tmp-cib.xml \
  constraint colocation add dlm-clone with DRBDClone \
  id=colocation-dlm-clone-DRBDClone-INFINITY
pcs -f tmp-cib.xml constraint order promote DRBDClone \
  then dlm-clone id=order-DRBDClone-dlm-clone-mandatory
pcs -f tmp-cib.xml \
  constraint colocation add WWWMount-clone with dlm-clone \
  id=colocation-WWWMount-clone-dlm-clone-INFINITY
pcs -f tmp-cib.xml constraint order dlm-clone \
  then WWWMount-clone id=order-dlm-clone-WWWMount-clone-mandatory
pcs -f tmp-cib.xml \
  constraint colocation add WebServer-clone with WWWMount-clone \
  id=colocation-WebServer-clone-WWWMount-clone-INFINITY
pcs -f tmp-cib.xml constraint order WWWMount-clone \
  then WebServer-clone id=order-WWWMount-clone-WebServer-clone-mandatory
pcs -f tmp-cib.xml \
  constraint colocation add SharedRootCrons-clone with WWWMount-clone \
  id=colocation-SharedRootCrons-clone-WWWMount-clone-INFINITY
pcs -f tmp-cib.xml \
  constraint colocation add SharedUserCrons-clone with WWWMount-clone \
  id=colocation-SharedUserCrons-clone-WWWMount-clone-INFINITY
pcs -f tmp-cib.xml constraint order WWWMount-clone \
  then SharedRootCrons-clone \
  id=order-WWWMount-clone-SharedRootCrons-clone-mandatory
pcs -f tmp-cib.xml constraint order WWWMount-clone \
  then SharedUserCrons-clone \
  id=order-WWWMount-clone-SharedUserCrons-clone-mandatory
pcs -f tmp-cib.xml \
  constraint location PrimaryUserCrons prefers node1.mydomain.com=500
pcs -f tmp-cib.xml \
  constraint colocation add PrimaryUserCrons with WWWMount-clone \
  id=colocation-PrimaryUserCrons-WWWMount-clone-INFINITY
pcs -f tmp-cib.xml constraint order WWWMount-clone \
  then PrimaryUserCrons \
  id=order-WWWMount-clone-PrimaryUserCrons-mandatory
pcs -f tmp-cib.xml \
  constraint location SecondaryUserCrons prefers node2.mydomain.com=500
pcs -f tmp-cib.xml \
  constraint colocation add SecondaryUserCrons with WWWMount-clone \
  id=colocation-SecondaryUserCrons-WWWMount-clone-INFINITY
pcs -f tmp-cib.xml constraint order WWWMount-clone \
  then SecondaryUserCrons \
  id=order-WWWMount-clone-SecondaryUserCrons-mandatory
pcs cluster cib-push tmp-cib.xml diff-against=tmp-cib.xml.deltasrc

When I put node2 in standby, SecondaryUserCrons moves over to node1 as expected. When I take node2 out of standby, it moves back to node2 immediately, before WWWMount has started there, and so it fails. What am I missing? Here are the log messages from the unstandby operation:

Sep  7 15:02:28 node2 crmd[58188]:   notice: State transition S_IDLE -> 
S_POLICY_ENGINE
Sep  7 15:02:28 node2 pengine[58187]:   notice:  * Start      DRBD:1            
     (                        node2.mydomain.com )
Sep 7 15:02:28 node2 pengine[58187]: notice: * Start dlm:1 ( node2.mydomain.com ) due to unrunnable DRBD:1 promote (blocked)
Sep 7 15:02:28 node2 pengine[58187]: notice: * Start WWWMount:1 ( node2.mydomain.com ) due to unrunnable dlm:1 start (blocked)
Sep 7 15:02:28 node2 pengine[58187]: notice: * Start WebServer:1 ( node2.mydomain.com ) due to unrunnable WWWMount:1 start (blocked)
Sep 7 15:02:28 node2 pengine[58187]: notice: * Start SharedRootCrons:1 ( node2.mydomain.com ) due to unrunnable WWWMount:1 start (blocked)
Sep 7 15:02:28 node2 pengine[58187]: notice: * Start SharedUserCrons:1 ( node2.mydomain.com ) due to unrunnable WWWMount:1 start (blocked)
Sep  7 15:02:28 node2 pengine[58187]:   notice:  * Move       SecondaryUserCrons   
  ( node1.mydomain.com -> node2.mydomain.com )
Sep  7 15:02:28 node2 pengine[58187]:   notice: Calculated transition 129, 
saving inputs in /var/lib/pacemaker/pengine/pe-input-2795.bz2
Sep  7 15:02:28 node2 crmd[58188]:   notice: Initiating stop operation 
SecondaryUserCrons_stop_0 on node1.mydomain.com
Sep  7 15:02:28 node2 crmd[58188]:   notice: Initiating notify operation 
DRBD_pre_notify_start_0 on node1.mydomain.com
Sep  7 15:02:28 node2 crmd[58188]:   notice: Initiating start operation 
SecondaryUserCrons_start_0 locally on node2.mydomain.com
Sep  7 15:02:28 node2 symlink(SecondaryUserCrons)[52196]: WARNING: 
/var/www/crons/User-server2 does not exist!
Sep  7 15:02:28 node2 crmd[58188]:   notice: Initiating start operation 
DRBD_start_0 locally on node2.mydomain.com
Sep  7 15:02:28 node2 symlink(SecondaryUserCrons)[52196]: INFO: 
'/etc/cron.d/User-server2' -> '/var/www/crons/User-server2'
Sep  7 15:02:28 node2 symlink(SecondaryUserCrons)[52196]: ERROR: 
/etc/cron.d/User-server2 does not point to /var/www/crons/User-server2!
Sep 7 15:02:28 node2 lrmd[58185]: notice: SecondaryUserCrons_start_0:52196:stderr [ ocf-exit-reason:/etc/cron.d/User-server2 does not point to /var/www/crons/User-server2! ]
Sep  7 15:02:28 node2 crmd[58188]:   notice: Result of start operation for 
SecondaryUserCrons on node2.mydomain.com: 5 (not installed)
Sep 7 15:02:28 node2 crmd[58188]: notice: node2.mydomain.com-SecondaryUserCrons_start_0:390 [ ocf-exit-reason:/etc/cron.d/User-server2 does not point to /var/www/crons/User-server2!\n ]
Sep 7 15:02:28 node2 crmd[58188]: warning: Action 109 (SecondaryUserCrons_start_0) on node2.mydomain.com failed (target: 0 vs. rc: 5): Error
Sep 7 15:02:28 node2 crmd[58188]: notice: Transition aborted by operation SecondaryUserCrons_start_0 'modify' on node2.mydomain.com: Event failed
Sep 7 15:02:28 node2 crmd[58188]: warning: Action 109 (SecondaryUserCrons_start_0) on node2.mydomain.com failed (target: 0 vs. rc: 5): Error
Sep 7 15:02:28 node2 crmd[58188]: notice: Transition aborted by status-2-fail-count-SecondaryUserCrons.start_0 doing create fail-count-SecondaryUserCrons#start_0=INFINITY: Transient attribute change
Sep  7 15:02:28 node2 kernel: drbd r0: Starting worker thread (from drbdsetup 
[52264])
Sep  7 15:02:28 node2 kernel: drbd r0/0 drbd1: disk( Diskless -> Attaching )
Sep  7 15:02:28 node2 kernel: drbd r0/0 drbd1: Maximum number of peer devices = 
1
Sep  7 15:02:28 node2 kernel: drbd r0: Method to ensure write ordering: drain
Sep  7 15:02:28 node2 kernel: drbd r0/0 drbd1: drbd_bm_resize called with 
capacity == 1048543928
Sep  7 15:02:28 node2 kernel: drbd r0/0 drbd1: resync bitmap: bits=131067991 
words=2047938 pages=4000
Sep  7 15:02:28 node2 kernel: drbd r0/0 drbd1: size = 500 GB (524271964 KB)
Sep  7 15:02:28 node2 kernel: drbd r0/0 drbd1: size = 500 GB (524271964 KB)
Sep  7 15:02:28 node2 kernel: drbd r0/0 drbd1: recounting of set bits took 
additional 13ms
Sep  7 15:02:28 node2 kernel: drbd r0/0 drbd1: disk( Attaching -> Outdated )
Sep  7 15:02:28 node2 kernel: drbd r0/0 drbd1: attached to current UUID: 
A2457506F4D44F1C
Sep  7 15:02:28 node2 kernel: drbd r0/1 drbd2: disk( Diskless -> Attaching )
Sep  7 15:02:28 node2 kernel: drbd r0/1 drbd2: Maximum number of peer devices = 
1
Sep  7 15:02:28 node2 kernel: drbd r0/1 drbd2: drbd_bm_resize called with 
capacity == 2097016
Sep  7 15:02:28 node2 kernel: drbd r0/1 drbd2: resync bitmap: bits=262127 
words=4096 pages=8
Sep  7 15:02:28 node2 kernel: drbd r0/1 drbd2: size = 1024 MB (1048508 KB)
Sep  7 15:02:28 node2 kernel: drbd r0/1 drbd2: size = 1024 MB (1048508 KB)
Sep  7 15:02:28 node2 kernel: drbd r0/1 drbd2: recounting of set bits took 
additional 0ms
Sep  7 15:02:28 node2 kernel: drbd r0/1 drbd2: disk( Attaching -> Outdated )
Sep  7 15:02:28 node2 kernel: drbd r0/1 drbd2: attached to current UUID: 
0EC5D56AEE53C6B6
Sep  7 15:02:28 node2 kernel: drbd r0 node1.mydomain.com: Starting sender 
thread (from drbdsetup [52291])
Sep  7 15:02:28 node2 kernel: drbd r0 node1.mydomain.com: conn( StandAlone -> 
Unconnected )
Sep  7 15:02:28 node2 kernel: drbd r0 node1.mydomain.com: Starting receiver 
thread (from drbd_w_r0 [52265])
Sep  7 15:02:28 node2 kernel: drbd r0 node1.mydomain.com: conn( Unconnected -> 
Connecting )
Sep  7 15:02:28 node2 crmd[58188]:   notice: Result of start operation for DRBD 
on node2.mydomain.com: 0 (ok)
Sep  7 15:02:28 node2 crmd[58188]:   notice: Initiating notify operation 
DRBD_post_notify_start_0 on node1.mydomain.com
Sep  7 15:02:28 node2 crmd[58188]:   notice: Initiating notify operation 
DRBD_post_notify_start_0 locally on node2.mydomain.com
Sep  7 15:02:28 node2 crmd[58188]:   notice: Result of notify operation for 
DRBD on node2.mydomain.com: 0 (ok)
Sep 7 15:02:28 node2 crmd[58188]: notice: Transition 129 (Complete=29, Pending=0, Fired=0, Skipped=1, Incomplete=7, Source=/var/lib/pacemaker/pengine/pe-input-2795.bz2): Stopped
Sep 7 15:02:28 node2 pengine[58187]: warning: Processing failed op start for SecondaryUserCrons on node2.mydomain.com: not installed (5)
Sep 7 15:02:28 node2 pengine[58187]: notice: Preventing SecondaryUserCrons from re-starting on node2.mydomain.com: operation start failed 'not installed' (5)
Sep 7 15:02:28 node2 pengine[58187]: warning: Processing failed op start for SecondaryUserCrons on node2.mydomain.com: not installed (5)
Sep 7 15:02:28 node2 pengine[58187]: notice: Preventing SecondaryUserCrons from re-starting on node2.mydomain.com: operation start failed 'not installed' (5)
Sep 7 15:02:28 node2 pengine[58187]: warning: Forcing SecondaryUserCrons away from node2.mydomain.com after 1000000 failures (max=1000000)
Sep 7 15:02:28 node2 pengine[58187]: notice: * Start dlm:1 ( node2.mydomain.com ) due to unrunnable DRBD:1 promote (blocked)
Sep 7 15:02:28 node2 pengine[58187]: notice: * Start WWWMount:1 ( node2.mydomain.com ) due to unrunnable dlm:1 start (blocked)
Sep 7 15:02:28 node2 pengine[58187]: notice: * Start WebServer:1 ( node2.mydomain.com ) due to unrunnable WWWMount:1 start (blocked)
Sep 7 15:02:28 node2 pengine[58187]: notice: * Start SharedRootCrons:1 ( node2.mydomain.com ) due to unrunnable WWWMount:1 start (blocked)
Sep 7 15:02:28 node2 pengine[58187]: notice: * Start SharedUserCrons:1 ( node2.mydomain.com ) due to unrunnable WWWMount:1 start (blocked)
Sep  7 15:02:28 node2 pengine[58187]:   notice:  * Recover    SecondaryUserCrons   
  ( node2.mydomain.com -> node1.mydomain.com )
Sep  7 15:02:28 node2 pengine[58187]:   notice: Calculated transition 130, 
saving inputs in /var/lib/pacemaker/pengine/pe-input-2796.bz2
Sep  7 15:02:28 node2 crmd[58188]:   notice: Initiating monitor operation 
DRBD_monitor_60000 locally on node2.mydomain.com
Sep  7 15:02:28 node2 crmd[58188]:   notice: Initiating stop operation 
SecondaryUserCrons_stop_0 locally on node2.mydomain.com
Sep  7 15:02:28 node2 symlink(SecondaryUserCrons)[52329]: WARNING: 
/var/www/crons/User-server2 does not exist!
Sep  7 15:02:28 node2 symlink(SecondaryUserCrons)[52329]: ERROR: 
/etc/cron.d/User-server2 does not point to /var/www/crons/User-server2!
Sep 7 15:02:28 node2 lrmd[58185]: notice: SecondaryUserCrons_stop_0:52329:stderr [ ocf-exit-reason:/etc/cron.d/User-server2 does not point to /var/www/crons/User-server2! ]
Sep  7 15:02:28 node2 crmd[58188]:   notice: Result of stop operation for 
SecondaryUserCrons on node2.mydomain.com: 5 (not installed)
Sep 7 15:02:28 node2 crmd[58188]: notice: node2.mydomain.com-SecondaryUserCrons_stop_0:394 [ ocf-exit-reason:/etc/cron.d/User-server2 does not point to /var/www/crons/User-server2!\n ]
Sep 7 15:02:28 node2 crmd[58188]: warning: Action 10 (SecondaryUserCrons_stop_0) on node2.mydomain.com failed (target: 0 vs. rc: 5): Error
Sep 7 15:02:28 node2 crmd[58188]: notice: Transition aborted by operation SecondaryUserCrons_stop_0 'modify' on node2.mydomain.com: Event failed
Sep 7 15:02:28 node2 crmd[58188]: warning: Action 10 (SecondaryUserCrons_stop_0) on node2.mydomain.com failed (target: 0 vs. rc: 5): Error
Sep 7 15:02:28 node2 crmd[58188]: notice: Transition aborted by status-2-fail-count-SecondaryUserCrons.stop_0 doing create fail-count-SecondaryUserCrons#stop_0=INFINITY: Transient attribute change
Sep 7 15:02:28 node2 crmd[58188]: notice: Transition 130 (Complete=18, Pending=0, Fired=0, Skipped=0, Incomplete=8, Source=/var/lib/pacemaker/pengine/pe-input-2796.bz2): Complete
Sep 7 15:02:29 node2 pengine[58187]: error: No further recovery can be attempted for SecondaryUserCrons: stop action failed with 'not installed' (5)
Sep 7 15:02:29 node2 pengine[58187]: warning: Processing failed op stop for SecondaryUserCrons on node2.mydomain.com: not installed (5)
Sep 7 15:02:29 node2 pengine[58187]: notice: Preventing SecondaryUserCrons from re-starting on node2.mydomain.com: operation stop failed 'not installed' (5)
Sep 7 15:02:29 node2 pengine[58187]: error: No further recovery can be attempted for SecondaryUserCrons: stop action failed with 'not installed' (5)
Sep 7 15:02:29 node2 pengine[58187]: warning: Processing failed op stop for SecondaryUserCrons on node2.mydomain.com: not installed (5)
Sep 7 15:02:29 node2 pengine[58187]: notice: Preventing SecondaryUserCrons from re-starting on node2.mydomain.com: operation stop failed 'not installed' (5)
Sep 7 15:02:29 node2 pengine[58187]: warning: Forcing SecondaryUserCrons away from node2.mydomain.com after 1000000 failures (max=1000000)
Sep 7 15:02:29 node2 pengine[58187]: notice: * Start dlm:1 ( node2.mydomain.com ) due to unrunnable DRBD:1 promote (blocked)
Sep 7 15:02:29 node2 pengine[58187]: notice: * Start WWWMount:1 ( node2.mydomain.com ) due to unrunnable dlm:1 start (blocked)
Sep 7 15:02:29 node2 pengine[58187]: notice: * Start WebServer:1 ( node2.mydomain.com ) due to unrunnable WWWMount:1 start (blocked)
Sep 7 15:02:29 node2 pengine[58187]: notice: * Start SharedRootCrons:1 ( node2.mydomain.com ) due to unrunnable WWWMount:1 start (blocked)
Sep 7 15:02:29 node2 pengine[58187]: notice: * Start SharedUserCrons:1 ( node2.mydomain.com ) due to unrunnable WWWMount:1 start (blocked)
Sep 7 15:02:29 node2 pengine[58187]: error: Calculated transition 131 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-26.bz2
Sep 7 15:02:29 node2 crmd[58188]: warning: Transition 131 (Complete=16, Pending=0, Fired=0, Skipped=0, Incomplete=5, Source=/var/lib/pacemaker/pengine/pe-error-26.bz2): Terminated
Sep  7 15:02:29 node2 crmd[58188]:  warning: Transition failed: terminated
Sep  7 15:02:29 node2 crmd[58188]:   notice: Graph 131 with 21 actions: 
batch-limit=0 jobs, network-delay=60000ms
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 47]: Completed pseudo op dlm-clone_running_0 on N/A (priority: 1000000, waiting: none)
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 46]: Completed pseudo op dlm-clone_start_0 on N/A (priority: 0, waiting: none)
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 55]: Completed pseudo op WWWMount-clone_running_0 on N/A (priority: 1000000, waiting: none)
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 54]: Completed pseudo op WWWMount-clone_start_0 on N/A (priority: 0, waiting: none)
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 69]: Pending rsc op WebServer_monitor_60000 on node2.mydomain.com (priority: 0, waiting: none)
Sep  7 15:02:29 node2 crmd[58188]:   notice:  * [Input 68]: Unresolved 
dependency rsc op WebServer_start_0 on node2.mydomain.com
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 71]: Completed pseudo op WebServer-clone_running_0 on N/A (priority: 1000000, waiting: none)
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 70]: Completed pseudo op WebServer-clone_start_0 on N/A (priority: 0, waiting: none)
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 93]: Pending rsc op SharedRootCrons_monitor_60000 on node2.mydomain.com (priority: 0, waiting: none)
Sep  7 15:02:29 node2 crmd[58188]:   notice:  * [Input 92]: Unresolved 
dependency rsc op SharedRootCrons_start_0 on node2.mydomain.com
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 95]: Completed pseudo op SharedRootCrons-clone_running_0 on N/A (priority: 1000000, waiting: none)
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 94]: Completed pseudo op SharedRootCrons-clone_start_0 on N/A (priority: 0, waiting: none)
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 101]: Pending rsc op SharedUserCrons_monitor_60000 on node2.mydomain.com (priority: 0, waiting: none)
Sep  7 15:02:29 node2 crmd[58188]:   notice:  * [Input 100]: Unresolved 
dependency rsc op SharedUserCrons_start_0 on node2.mydomain.com
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 103]: Completed pseudo op SharedUserCrons-clone_running_0 on N/A (priority: 1000000, waiting: none)
Sep 7 15:02:29 node2 crmd[58188]: notice: [Action 102]: Completed pseudo op SharedUserCrons-clone_start_0 on N/A (priority: 0, waiting: none)
Sep  7 15:02:29 node2 crmd[58188]:   notice: State transition S_TRANSITION_ENGINE 
-> S_IDLE
Sep  7 15:02:29 node2 kernel: drbd r0 node1.mydomain.com: Handshake to peer 0 
successful: Agreed network protocol version 113
Sep 7 15:02:29 node2 kernel: drbd r0 node1.mydomain.com: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
Sep  7 15:02:29 node2 kernel: drbd r0 node1.mydomain.com: Starting ack_recv 
thread (from drbd_r_r0 [52295])
Sep  7 15:02:29 node2 kernel: drbd r0 node1.mydomain.com: Preparing remote 
state change 2019156377
Sep  7 15:02:29 node2 kernel: drbd r0 node1.mydomain.com: Committing remote 
state change 2019156377 (primary_nodes=1)
Sep  7 15:02:29 node2 kernel: drbd r0 node1.mydomain.com: conn( Connecting -> 
Connected ) peer( Unknown -> Primary )
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: 
drbd_sync_handshake:
Sep 7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: self A2457506F4D44F1C:0000000000000000:B13E5D392CF268C4:FE2F70857D64FB02 bits:0 flags:20
Sep 7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: peer D355B0F942665879:A2457506F4D44F1D:B13E5D392CF268C4:E56E164C51EEFAB0 bits:6 flags:120
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: 
uuid_compare()=-2 by rule 50
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: pdsk( DUnknown 
-> UpToDate ) repl( Off -> WFBitMapT )
Sep  7 15:02:29 node2 kernel: drbd r0/1 drbd2 node1.mydomain.com: 
drbd_sync_handshake:
Sep 7 15:02:29 node2 kernel: drbd r0/1 drbd2 node1.mydomain.com: self 0EC5D56AEE53C6B6:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:20
Sep 7 15:02:29 node2 kernel: drbd r0/1 drbd2 node1.mydomain.com: peer 0EC5D56AEE53C6B6:0000000000000000:B62926494645765C:0000000000000000 bits:0 flags:120
Sep  7 15:02:29 node2 kernel: drbd r0/1 drbd2 node1.mydomain.com: 
uuid_compare()=0 by rule 38
Sep  7 15:02:29 node2 kernel: drbd r0/1 drbd2: disk( Outdated -> UpToDate )
Sep  7 15:02:29 node2 kernel: drbd r0/1 drbd2 node1.mydomain.com: pdsk( DUnknown 
-> UpToDate ) repl( Off -> Established )
Sep 7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 27(1), total 27; compression: 100.0%
Sep 7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 27(1), total 27; compression: 100.0%
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: helper 
command: /sbin/drbdadm before-resync-target
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: helper 
command: /sbin/drbdadm before-resync-target exit code 0 (0x0)
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1: disk( Outdated -> Inconsistent )
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: repl( WFBitMapT 
-> SyncTarget )
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: Began resync 
as SyncTarget (will sync 24 KB [6 bits set]).
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: Resync done 
(total 1 sec; paused 0 sec; 24 K/sec)
Sep 7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: updated UUIDs D355B0F942665878:0000000000000000:A2457506F4D44F1C:E2BDB50A1BFBAE5E
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1: disk( Inconsistent -> UpToDate )
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: repl( SyncTarget 
-> Established )
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: helper 
command: /sbin/drbdadm after-resync-target
Sep  7 15:02:29 node2 kernel: drbd r0/0 drbd1 node1.mydomain.com: helper 
command: /sbin/drbdadm after-resync-target exit code 0 (0x0)
Sep  7 15:03:29 node2 crmd[58188]:   notice: State transition S_IDLE -> 
S_POLICY_ENGINE
Sep 7 15:03:29 node2 pengine[58187]: error: No further recovery can be attempted for SecondaryUserCrons: stop action failed with 'not installed' (5)
Sep 7 15:03:29 node2 pengine[58187]: warning: Processing failed op stop for SecondaryUserCrons on node2.mydomain.com: not installed (5)
Sep 7 15:03:29 node2 pengine[58187]: notice: Preventing SecondaryUserCrons from re-starting on node2.mydomain.com: operation stop failed 'not installed' (5)
Sep 7 15:03:29 node2 pengine[58187]: error: No further recovery can be attempted for SecondaryUserCrons: stop action failed with 'not installed' (5)
Sep 7 15:03:29 node2 pengine[58187]: warning: Processing failed op stop for SecondaryUserCrons on node2.mydomain.com: not installed (5)
Sep 7 15:03:29 node2 pengine[58187]: notice: Preventing SecondaryUserCrons from re-starting on node2.mydomain.com: operation stop failed 'not installed' (5)
Sep 7 15:03:29 node2 pengine[58187]: warning: Forcing SecondaryUserCrons away from node2.mydomain.com after 1000000 failures (max=1000000)
Sep  7 15:03:29 node2 pengine[58187]:   notice:  * Promote    DRBD:1               
  (        Slave -> Master node2.mydomain.com )
Sep  7 15:03:29 node2 pengine[58187]:   notice:  * Start      dlm:1             
     (                        node2.mydomain.com )
Sep  7 15:03:29 node2 pengine[58187]:   notice:  * Start      WWWMount:1        
     (                        node2.mydomain.com )
Sep  7 15:03:29 node2 pengine[58187]:   notice:  * Start      WebServer:1       
     (                        node2.mydomain.com )
Sep  7 15:03:29 node2 pengine[58187]:   notice:  * Start      SharedRootCrons:1 
     (                        node2.mydomain.com )
Sep  7 15:03:29 node2 pengine[58187]:   notice:  * Start      SharedUserCrons:1 
     (                        node2.mydomain.com )
Sep 7 15:03:29 node2 pengine[58187]: error: Calculated transition 132 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-27.bz2
Sep  7 15:03:29 node2 crmd[58188]:   notice: Initiating cancel operation 
DRBD_monitor_60000 locally on node2.mydomain.com
Sep  7 15:03:29 node2 crmd[58188]:   notice: Initiating notify operation 
DRBD_pre_notify_promote_0 on node1.mydomain.com
Sep  7 15:03:29 node2 crmd[58188]:   notice: Initiating notify operation 
DRBD_pre_notify_promote_0 locally on node2.mydomain.com
Sep  7 15:03:29 node2 crmd[58188]:   notice: Result of notify operation for 
DRBD on node2.mydomain.com: 0 (ok)
Sep  7 15:03:29 node2 crmd[58188]:   notice: Initiating promote operation 
DRBD_promote_0 locally on node2.mydomain.com
Sep  7 15:03:29 node2 kernel: drbd r0: Preparing cluster-wide state change 
360863446 (1->-1 3/1)
Sep  7 15:03:29 node2 kernel: drbd r0: State change 360863446: primary_nodes=3, 
weak_nodes=FFFFFFFFFFFFFFFC
Sep  7 15:03:29 node2 kernel: drbd r0: Committing cluster-wide state change 
360863446 (0ms)
Sep  7 15:03:29 node2 kernel: drbd r0: role( Secondary -> Primary )
Sep  7 15:03:29 node2 crmd[58188]:   notice: Result of promote operation for 
DRBD on node2.mydomain.com: 0 (ok)
Sep  7 15:03:29 node2 crmd[58188]:   notice: Initiating notify operation 
DRBD_post_notify_promote_0 on node1.mydomain.com
Sep  7 15:03:29 node2 crmd[58188]:   notice: Initiating notify operation 
DRBD_post_notify_promote_0 locally on node2.mydomain.com
Sep  7 15:03:29 node2 crmd[58188]:   notice: Result of notify operation for 
DRBD on node2.mydomain.com: 0 (ok)
Sep  7 15:03:29 node2 crmd[58188]:   notice: Initiating start operation 
dlm_start_0 locally on node2.mydomain.com
Sep  7 15:03:29 node2 dlm_controld[53127]: 693403 dlm_controld 4.0.7 started
Sep  7 15:03:30 node2 crmd[58188]:   notice: Result of start operation for dlm 
on node2.mydomain.com: 0 (ok)
Sep  7 15:03:30 node2 crmd[58188]:   notice: Initiating monitor operation 
dlm_monitor_60000 locally on node2.mydomain.com
Sep  7 15:03:30 node2 crmd[58188]:   notice: Initiating start operation 
WWWMount_start_0 locally on node2.mydomain.com
Sep  7 15:03:30 node2 Filesystem(WWWMount)[53154]: INFO: Running start for 
/dev/drbd1 on /var/www
Sep  7 15:03:30 node2 kernel: dlm: Using TCP for communications
Sep  7 15:03:30 node2 kernel: GFS2: fsid=MyCluster:www: Trying to join cluster 
"lock_dlm", "MyCluster:www"
Sep  7 15:03:30 node2 kernel: dlm: connecting to 1
Sep  7 15:03:30 node2 kernel: dlm: got connection from 1
Sep  7 15:03:31 node2 kernel: GFS2: fsid=MyCluster:www: Joined cluster. Now 
mounting FS...
Sep  7 15:03:31 node2 kernel: GFS2: fsid=MyCluster:www.1: jid=1, already locked 
for use
Sep  7 15:03:31 node2 kernel: GFS2: fsid=MyCluster:www.1: jid=1: Looking at 
journal...
Sep  7 15:03:31 node2 kernel: GFS2: fsid=MyCluster:www.1: jid=1: Done
Sep  7 15:03:31 node2 crmd[58188]:   notice: Result of start operation for 
WWWMount on node2.mydomain.com: 0 (ok)
Sep  7 15:03:31 node2 crmd[58188]:   notice: Initiating monitor operation 
WWWMount_monitor_20000 locally on node2.mydomain.com
Sep  7 15:03:31 node2 crmd[58188]:   notice: Initiating start operation 
WebServer_start_0 locally on node2.mydomain.com
Sep  7 15:03:31 node2 crmd[58188]:   notice: Initiating start operation 
SharedRootCrons_start_0 locally on node2.mydomain.com
Sep  7 15:03:31 node2 crmd[58188]:   notice: Initiating start operation 
SharedUserCrons_start_0 locally on node2.mydomain.com
Sep  7 15:03:31 node2 symlink(SharedRootCrons)[53328]: INFO: 
'/etc/cron.d/root-shared' -> '/var/www/crons/root-shared'
Sep  7 15:03:31 node2 symlink(SharedUserCrons)[53329]: INFO: 
'/etc/cron.d/User-shared' -> '/var/www/crons/User-shared'
Sep  7 15:03:31 node2 crmd[58188]:   notice: Result of start operation for 
SharedRootCrons on node2.mydomain.com: 0 (ok)
Sep  7 15:03:31 node2 crmd[58188]:   notice: Result of start operation for 
SharedUserCrons on node2.mydomain.com: 0 (ok)
Sep  7 15:03:31 node2 crmd[58188]:   notice: Initiating monitor operation 
SharedRootCrons_monitor_60000 locally on node2.mydomain.com
Sep  7 15:03:31 node2 crmd[58188]:   notice: Initiating monitor operation 
SharedUserCrons_monitor_60000 locally on node2.mydomain.com
Sep  7 15:03:31 node2 apache(WebServer)[53325]: INFO: apache not running
Sep  7 15:03:31 node2 apache(WebServer)[53325]: INFO: waiting for apache 
/etc/httpd/conf/httpd.conf to come up
Sep  7 15:03:32 node2 crmd[58188]:   notice: Result of start operation for 
WebServer on node2.mydomain.com: 0 (ok)
Sep  7 15:03:32 node2 crmd[58188]:   notice: Initiating monitor operation 
WebServer_monitor_60000 locally on node2.mydomain.com
Sep 7 15:03:33 node2 crmd[58188]: notice: Transition 132 (Complete=44, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-error-27.bz2): Complete
Sep  7 15:03:33 node2 crmd[58188]:   notice: State transition S_TRANSITION_ENGINE 
-> S_IDLE
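
To line up when each resource operation was initiated and what it returned, the crmd entries can be pulled out of a log like the one above. This is a rough sketch under some assumptions: the log lives in /var/log/messages (not stated in this post), each entry is on a single line there, and the function name `rsc_timeline` is made up for illustration.

```shell
#!/bin/sh
# rsc_timeline (hypothetical name): read syslog lines on stdin and keep only the
# crmd "Initiating ..." / "Result of ..." entries for start, stop and promote
# operations, so the order of resource actions is easy to compare.
rsc_timeline() {
    grep -E 'crmd\[[0-9]+\]: +notice: +(Initiating|Result of) (start|stop|promote) operation'
}

# Assumed log location; adjust to wherever syslog writes on your nodes.
LOG=${LOG:-/var/log/messages}
if [ -r "$LOG" ]; then
    # Narrow the timeline to the resources involved in the failed move.
    rsc_timeline < "$LOG" | grep -E 'SecondaryUserCrons|WWWMount|DRBD' || true
fi
```

On the excerpt above, such a filter would show SecondaryUserCrons_start_0 being initiated on node2 at 15:02:28, while WWWMount_start_0 is only initiated there at 15:03:30. The saved pengine input named in the log could probably also be replayed offline with `crm_simulate --simulate --xml-file /var/lib/pacemaker/pengine/pe-input-2795.bz2` to see the planned actions without touching the cluster.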



_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
