On 07/16/2016 04:12 PM, TEG AMJG wrote:
> Dear list,
>
> I am quite new to Pacemaker and I am configuring a two-node
> active/active cluster which consists basically of something like this:
>
> My whole configuration is this one:
>
> Stack: corosync
> Current DC: pbx2vs3 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
> 2 nodes and 10 resources configured
>
> Online: [ pbx1vs3 pbx2vs3 ]
>
> Full list of resources:
>
>  Clone Set: dlm-clone [dlm]
>      Started: [ pbx1vs3 pbx2vs3 ]
>  Clone Set: asteriskfs-clone [asteriskfs]
>      Started: [ pbx1vs3 pbx2vs3 ]
>  Clone Set: asterisk-clone [asterisk]
>      Started: [ pbx1vs3 pbx2vs3 ]
>  fence_pbx2_xvm (stonith:fence_xvm): Started pbx2vs3
>  fence_pbx1_xvm (stonith:fence_xvm): Started pbx1vs3
>  Clone Set: clvmd-clone [clvmd]
>      Started: [ pbx1vs3 pbx2vs3 ]
>
> PCSD Status:
>   pbx1vs3: Online
>   pbx2vs3: Online
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled
>
> [root@pbx1 ~]# pcs config show
> Cluster Name: asteriskcluster
> Corosync Nodes:
>  pbx1vs3 pbx2vs3
> Pacemaker Nodes:
>  pbx1vs3 pbx2vs3
>
> Resources:
>  Clone: dlm-clone
>   Meta Attrs: clone-max=2 clone-node-max=1 interleave=true
>   Resource: dlm (class=ocf provider=pacemaker type=controld)
>    Attributes: allow_stonith_disabled=false
>    Operations: start interval=0s timeout=90 (dlm-start-interval-0s)
>                stop interval=0s on-fail=fence (dlm-stop-interval-0s)
>                monitor interval=60s on-fail=fence (dlm-monitor-interval-60s)
>  Clone: asteriskfs-clone
>   Meta Attrs: interleave=true clone-max=2 clone-node-max=1
>   Resource: asteriskfs (class=ocf provider=heartbeat type=Filesystem)
>    Attributes: device=/dev/vg_san1/lv_pbx directory=/mnt/asterisk fstype=gfs2
>    Operations: start interval=0s timeout=60 (asteriskfs-start-interval-0s)
>                stop interval=0s on-fail=fence (asteriskfs-stop-interval-0s)
>                monitor interval=60s on-fail=fence (asteriskfs-monitor-interval-60s)
>  Clone: asterisk-clone
>   Meta Attrs: interleaved=true sipp_monitor=/root/scripts/haasterisk.sh
>               sipp_binary=/usr/local/src/sipp-3.4.1/bin/sipp globally-unique=false
>               ordered=false interleave=true clone-max=2 clone-node-max=1 notify=true
>   Resource: asterisk (class=ocf provider=heartbeat type=asterisk)
>    Attributes: user=root group=root config=/mnt/asterisk/etc/asterisk.conf
>                sipp_monitor=/root/scripts/haasterisk.sh
>                sipp_binary=/usr/local/src/sipp-3.4.1/bin/sipp maxfiles=65535
>    Operations: start interval=0s timeout=40s (asterisk-start-interval-0s)
>                stop interval=0s on-fail=fence (asterisk-stop-interval-0s)
>                monitor interval=10s (asterisk-monitor-interval-10s)
>  Clone: clvmd-clone
>   Meta Attrs: clone-max=2 clone-node-max=1 interleave=true
>   Resource: clvmd (class=ocf provider=heartbeat type=clvm)
>    Operations: start interval=0s timeout=90 (clvmd-start-interval-0s)
>                monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
>                stop interval=0s on-fail=fence (clvmd-stop-interval-0s)
>
> Stonith Devices:
>  Resource: fence_pbx2_xvm (class=stonith type=fence_xvm)
>   Attributes: port=tegamjg_pbx2 pcmk_host_list=pbx2vs3
>   Operations: monitor interval=60s (fence_pbx2_xvm-monitor-interval-60s)
>  Resource: fence_pbx1_xvm (class=stonith type=fence_xvm)
>   Attributes: port=tegamjg_pbx1 pcmk_host_list=pbx1vs3
>   Operations: monitor interval=60s (fence_pbx1_xvm-monitor-interval-60s)
> Fencing Levels:
>
> Location Constraints:
> Ordering Constraints:
>   start fence_pbx1_xvm then start fence_pbx2_xvm (kind:Mandatory)
>   (id:order-fence_pbx1_xvm-fence_pbx2_xvm-mandatory)
>   start fence_pbx2_xvm then start dlm-clone (kind:Mandatory)
>   (id:order-fence_pbx2_xvm-dlm-clone-mandatory)
>   start dlm-clone then start clvmd-clone (kind:Mandatory)
>   (id:order-dlm-clone-clvmd-clone-mandatory)
>   start clvmd-clone then start asteriskfs-clone (kind:Mandatory)
>   (id:order-clvmd-clone-asteriskfs-clone-mandatory)
>   start asteriskfs-clone then start asterisk-clone (kind:Mandatory)
>   (id:order-asteriskfs-clone-asterisk-clone-mandatory)
> Colocation Constraints:
>   clvmd-clone with dlm-clone (score:INFINITY)
>   (id:colocation-clvmd-clone-dlm-clone-INFINITY)
>   asteriskfs-clone with clvmd-clone (score:INFINITY)
>   (id:colocation-asteriskfs-clone-clvmd-clone-INFINITY)
>   asterisk-clone with asteriskfs-clone (score:INFINITY)
>   (id:colocation-asterisk-clone-asteriskfs-clone-INFINITY)
>
> Resources Defaults:
>  migration-threshold: 2
>  failure-timeout: 10m
>  start-failure-is-fatal: false
> Operations Defaults:
>  No defaults set
>
> Cluster Properties:
>  cluster-infrastructure: corosync
>  cluster-name: asteriskcluster
>  dc-version: 1.1.13-10.el7_2.2-44eb2dd
>  have-watchdog: false
>  last-lrm-refresh: 1468598829
>  no-quorum-policy: ignore
>  stonith-action: reboot
>  stonith-enabled: true
>
> Now my problem is that, for example, when I fence one of the nodes, the
> other one restarts every clone resource and starts them back again; the
> same thing happens when I stop pacemaker and corosync on one node only
> (pcs cluster stop). That would mean that if I had a problem in one of my
> Asterisk nodes (for example in the DLM or CLVMD resource) that required
> fencing right away, say on node pbx2vs3, the other node (pbx1vs3) would
> restart every service, which would drop all my calls on a well
> functioning node. To be even more general, this happens every time a
> resource needs a stop/start or restart on any node: it gets done on
> every node in the cluster.

My guess is that this behavior is due to the ordering constraints you
defined for the stonith resources. When everything is fine you probably
have one of them running on each node; when a node is removed, one of
the stonith resources is gone, everything else depends on it, so
everything is stopped, the stonith resource is moved, and everything is
started again.

Why do you have separate resources for fencing the nodes? fence_xvm can
be used for a list of nodes, and you should be able to clone the stonith
resource as well, so that each node runs an instance that can fence both
nodes (see the rough sketch at the bottom of this mail).

> All this leads to a basic question: is this a strict way for clone
> resources to behave? Is it possible to configure them so they would
> behave, dare I say, in a more unique way? (I know about the option
> globally-unique, but as far as I understand that does not do the job.)
> I have been reading about clone resources for a while, but there are
> not many examples of what they cannot do.
>
> There are some meta attributes that do not make sense, sorry about
> that; the problem is that I do not know how to delete them with PCSD :).
>
> Now, I found something interesting about constraint ordering with clone
> resources in the "Pacemaker Explained" documentation, which describes
> something like this:
>
> "<constraints>
>    <rsc_location id="clone-prefers-node1" rsc="apache-clone" node="node1" score="500"/>
>    <rsc_colocation id="stats-with-clone" rsc="apache-stats" with="apache-clone"/>
>    <rsc_order id="start-clone-then-stats" first="apache-clone" then="apache-stats"/>
> </constraints>"
>
> "Ordering constraints behave slightly differently for clones. In the
> example above, apache-stats will wait until all copies of apache-clone
> that need to be started have done so before being started itself. Only
> if no copies can be started will apache-stats be prevented from being
> active. Additionally, the clone will wait for apache-stats to be
> stopped before stopping itself."
>
> I am not sure if that has something to do with it, but I cannot destroy
> the whole cluster to test it, probably in vain.
>
> Thank you very much. Regards,
>
> Alejandro
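
Something along these lines might work. This is an untested sketch with
pcs, reusing the node and domain names from your config; the resource
name fence_pbx_xvm is just an example, and you should check the
fence_xvm options (multicast address, key file, etc.) against your own
setup before applying anything:

  # One fence_xvm device that covers both guests; pcmk_host_map maps
  # cluster node name -> libvirt domain name ("port").
  # (fence_pbx_xvm is an arbitrary example name.)
  pcs stonith create fence_pbx_xvm fence_xvm \
      pcmk_host_map="pbx1vs3:tegamjg_pbx1;pbx2vs3:tegamjg_pbx2" \
      op monitor interval=60s

  # Drop the ordering constraints that chain the clones behind the
  # per-node fencing devices.
  pcs constraint remove order-fence_pbx1_xvm-fence_pbx2_xvm-mandatory
  pcs constraint remove order-fence_pbx2_xvm-dlm-clone-mandatory

  # Remove the old per-node devices.
  pcs stonith delete fence_pbx1_xvm
  pcs stonith delete fence_pbx2_xvm

If the two guests live on different hypervisors and need different
multicast addresses or key files, you can keep two devices instead, but
the ordering constraints from the fencing devices to dlm-clone should go
away in any case: a fencing device does not have to be "started" before
the resources it protects, the started state mainly controls where it is
monitored.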

_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org