On Wed, Apr 13, 2016, at 12:36 PM, Ken Gaillot wrote:
> On 04/13/2016 11:23 AM, Christopher Harvey wrote:
> > I have a 3-node cluster (see the bottom of this email for 'pcs config'
> > output). The MsgBB-Active and AD-Active services both flap whenever a
> > node joins or leaves the cluster. I trigger the leave and join with a
> > pacemaker service stop and start on any node.
> 
> That's the default behavior of clones used in ordering constraints. If
> you set interleave=true on your clones, each dependent clone instance
> will only care about the depended-on instances on its own node, rather
> than on all nodes.
> 
> See
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_clone_options
> 
> While the interleave=true behavior is much more commonly used,
> interleave=false is the default because it's safer -- the cluster
> doesn't know anything about the cloned service, so it can't assume the
> service is OK with it. Since you know what your service does, you can
> set interleave=true for services that can handle it.
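[For anyone following along: setting the attribute Ken describes on an existing clone is a one-liner. This is a sketch, not from the original thread; it assumes the pcs 0.9.x syntax used elsewhere in this thread, so check `man pcs` for your version. It needs a live cluster, hence no automated test.]

```sh
# Sketch: set interleave=true on the existing clone (pcs 0.9.x syntax
# assumed; not part of the original thread).
pcs resource meta Router-clone interleave=true

# Confirm the meta attribute took effect -- the line after the clone
# header in 'pcs config' should now include interleave=true:
pcs config | grep -A 1 'Clone: Router-clone'
```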
Hi Ken,

Thanks for pointing out that attribute to me. I applied it as follows:

 Clone: Router-clone
  Meta Attrs: clone-max=2 clone-node-max=1 interleave=true
  Resource: Router (class=ocf provider=solace type=Router)
   Meta Attrs: migration-threshold=1 failure-timeout=1s
   Operations: start interval=0s timeout=2 (Router-start-interval-0s)
               stop interval=0s timeout=2 (Router-stop-interval-0s)
               monitor interval=1s (Router-monitor-interval-1s)

It doesn't seem to change the behavior. Moreover, I found that I can
start/stop the pacemaker instance on the vmr-132-5 node and produce the
same flap in the MsgBB-Active resource on the vmr-132-3 node. The
Router clones are never stopped or started. I would have thought that
if everything else in the cluster is constant, vmr-132-5 could never
affect resources on the other two nodes.

> > Here is the happy steady state setup:
> > 
> > 3 nodes and 4 resources configured
> > 
> > Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> > 
> > Clone Set: Router-clone [Router]
> >     Started: [ vmr-132-3 vmr-132-4 ]
> > MsgBB-Active (ocf::solace:MsgBB-Active): Started vmr-132-3
> > AD-Active (ocf::solace:AD-Active): Started vmr-132-3
> > 
> > [root@vmr-132-4 ~]# supervisorctl stop pacemaker
> > no change, except vmr-132-4 goes offline
> > [root@vmr-132-4 ~]# supervisorctl start pacemaker
> > vmr-132-4 comes back online
> > MsgBB-Active and AD-Active flap very quickly (<1s)
> > Steady state is resumed.
> > 
> > Why should vmr-132-4 coming and going affect the service on any
> > other node?
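[Editorial aside, not from the original thread: one way to see exactly which actions Pacemaker schedules when a node comes or goes is to replay the decision with crm_simulate, which is read-only against the live CIB. The node name below is from this thread; option spellings should be checked against `man crm_simulate` for your Pacemaker version. It needs a live cluster, hence no automated test.]

```sh
# Sketch: replay the current cluster state and print the actions the
# policy engine would schedule right now (read-only; run on any node):
crm_simulate --simulate --live-check

# Model vmr-132-5 leaving, to see whether a stop/start of MsgBB-Active
# gets scheduled even though Router-clone never runs on that node:
crm_simulate --simulate --live-check --node-down vmr-132-5
```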
> > Thanks,
> > Chris
> > 
> > Cluster Name:
> > Corosync Nodes:
> >  192.168.132.5 192.168.132.4 192.168.132.3
> > Pacemaker Nodes:
> >  vmr-132-3 vmr-132-4 vmr-132-5
> > 
> > Resources:
> >  Clone: Router-clone
> >   Meta Attrs: clone-max=2 clone-node-max=1
> >   Resource: Router (class=ocf provider=solace type=Router)
> >    Meta Attrs: migration-threshold=1 failure-timeout=1s
> >    Operations: start interval=0s timeout=2 (Router-start-timeout-2)
> >                stop interval=0s timeout=2 (Router-stop-timeout-2)
> >                monitor interval=1s (Router-monitor-interval-1s)
> >  Resource: MsgBB-Active (class=ocf provider=solace type=MsgBB-Active)
> >   Meta Attrs: migration-threshold=2 failure-timeout=1s
> >   Operations: start interval=0s timeout=2 (MsgBB-Active-start-timeout-2)
> >               stop interval=0s timeout=2 (MsgBB-Active-stop-timeout-2)
> >               monitor interval=1s (MsgBB-Active-monitor-interval-1s)
> >  Resource: AD-Active (class=ocf provider=solace type=AD-Active)
> >   Meta Attrs: migration-threshold=2 failure-timeout=1s
> >   Operations: start interval=0s timeout=2 (AD-Active-start-timeout-2)
> >               stop interval=0s timeout=2 (AD-Active-stop-timeout-2)
> >               monitor interval=1s (AD-Active-monitor-interval-1s)
> > 
> > Stonith Devices:
> > Fencing Levels:
> > 
> > Location Constraints:
> >   Resource: AD-Active
> >     Disabled on: vmr-132-5 (score:-INFINITY) (id:ADNotOnMonitor)
> >   Resource: MsgBB-Active
> >     Enabled on: vmr-132-4 (score:100) (id:vmr-132-4Priority)
> >     Enabled on: vmr-132-3 (score:250) (id:vmr-132-3Priority)
> >     Disabled on: vmr-132-5 (score:-INFINITY) (id:MsgBBNotOnMonitor)
> >   Resource: Router-clone
> >     Disabled on: vmr-132-5 (score:-INFINITY) (id:RouterNotOnMonitor)
> > Ordering Constraints:
> >   Resource Sets:
> >     set Router-clone MsgBB-Active sequential=true
> >       (id:pcs_rsc_set_Router-clone_MsgBB-Active) setoptions kind=Mandatory
> >       (id:pcs_rsc_order_Router-clone_MsgBB-Active)
> >     set MsgBB-Active AD-Active sequential=true
> >       (id:pcs_rsc_set_MsgBB-Active_AD-Active) setoptions kind=Mandatory
> >       (id:pcs_rsc_order_MsgBB-Active_AD-Active)
> > Colocation Constraints:
> >   MsgBB-Active with Router-clone (score:INFINITY)
> >     (id:colocation-MsgBB-Active-Router-clone-INFINITY)
> >   AD-Active with MsgBB-Active (score:1000)
> >     (id:colocation-AD-Active-MsgBB-Active-1000)
> > 
> > Resources Defaults:
> >  No defaults set
> > Operations Defaults:
> >  No defaults set
> > 
> > Cluster Properties:
> >  cluster-infrastructure: corosync
> >  cluster-recheck-interval: 1s
> >  dc-version: 1.1.13-10.el7_2.2-44eb2dd
> >  have-watchdog: false
> >  maintenance-mode: false
> >  start-failure-is-fatal: false
> >  stonith-enabled: false
> 
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org