26.08.2017 19:36, Octavian Ciobanu wrote:
Thank you for your reply.

There is no reason to set location constraints for the resources, I
think, because all the resources are configured as clones, so they are
started on all nodes at the same time.

You still need to colocate the "upper" resources with their dependencies. Otherwise Pacemaker will try to start them even if their dependencies fail. Order without colocation has very limited use (usually when resources may run on different nodes); for clones that is even more exotic.
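For example, something along these lines (only a sketch reusing the clone IDs from your commands below, not tested against your cluster):

pcs constraint colocation add Mount1-clone with iSCSI1-clone INFINITY
pcs constraint colocation add Mount2-clone with iSCSI2-clone INFINITY
pcs constraint colocation add Mount3-clone with iSCSI3-clone INFINITY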

For your original question: make sure interleave=true is set for all your clones. You seem to be missing it on the iSCSI ones. interleave=false (the default) is for a different use case, when the upper resources require all clone instances across the cluster to be up, not just the instance on the same node.
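You can set it on the existing clones, e.g. (a sketch; the exact syntax may vary with your pcs version):

pcs resource meta iSCSI1-clone interleave=true
pcs resource meta iSCSI2-clone interleave=true
pcs resource meta iSCSI3-clone interleave=true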

Also, just a minor note: the iSCSI resources do not actually depend on DLM; it is the mounts that should depend on it.
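In other words, roughly (again only a sketch; run "pcs constraint list --full" first to get the IDs of the existing DLM-then-iSCSI order constraints before removing them):

pcs constraint remove <id-of-DLM-clone-then-iSCSI1-clone-order>   # likewise for iSCSI2/iSCSI3
pcs constraint order DLM-clone then Mount1-clone
pcs constraint order DLM-clone then Mount2-clone
pcs constraint order DLM-clone then Mount3-clone
pcs constraint colocation add Mount1-clone with DLM-clone INFINITY
pcs constraint colocation add Mount2-clone with DLM-clone INFINITY
pcs constraint colocation add Mount3-clone with DLM-clone INFINITY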


And when it comes to stickiness, I forgot to mention it, but it is set
to 200. I also have stonith configured to use VMware ESXi.

Best regards
Octavian Ciobanu

On Sat, Aug 26, 2017 at 6:16 PM, John Keates <j...@keates.nl> wrote:

    While I am by no means a CRM/Pacemaker expert, I only see the
    resource primitives and the order constraints. Wouldn't you need
    location and/or colocation constraints, as well as stickiness
    settings, to prevent this from happening? What I think it might be
    doing is seeing the new node, then trying to move the resources (but
    not finding a suitable target) and then moving them back where they
    came from, fast enough that you only see it as a restart.

    If you run crm_resource -P, it should also restart all resources,
    but put them in the preferred spot. If they end up in the same
    place, you probably didn't put any weighting in the config or have
    stickiness set to INF.

    Kind regards,

    John Keates

    On 26 Aug 2017, at 14:23, Octavian Ciobanu
    <coctavian1...@gmail.com> wrote:

    Hello all,

    While playing with the cluster configuration I noticed a strange
    behavior. If I stop/standby the cluster services on one node and
    reboot it, then when it rejoins the cluster all the resources that
    were started and running on the active nodes get stopped and
    restarted.

    My test configuration is based on 4 nodes. One is a storage node
    that makes 3 iSCSI targets available for the other nodes to use; it
    is not configured to join the cluster. The other three nodes are
    configured in a cluster using the following commands.

    pcs resource create DLM ocf:pacemaker:controld op monitor
    interval="60" on-fail="fence" clone meta clone-max="3"
    clone-node-max="1" interleave="true" ordered="true"
    pcs resource create iSCSI1 ocf:heartbeat:iscsi
    portal="10.0.0.1:3260" target="iqn.2017-08.example.com:tgt1"
    op start interval="0" timeout="20" op stop interval="0" timeout="20"
    op monitor interval="120" timeout="30" clone meta clone-max="3"
    clone-node-max="1"
    pcs resource create iSCSI2 ocf:heartbeat:iscsi
    portal="10.0.0.1:3260" target="iqn.2017-08.example.com:tgt2"
    op start interval="0" timeout="20" op stop interval="0" timeout="20"
    op monitor interval="120" timeout="30" clone meta clone-max="3"
    clone-node-max="1"
    pcs resource create iSCSI3 ocf:heartbeat:iscsi
    portal="10.0.0.1:3260" target="iqn.2017-08.example.com:tgt3"
    op start interval="0" timeout="20" op stop interval="0" timeout="20"
    op monitor interval="120" timeout="30" clone meta clone-max="3"
    clone-node-max="1"
    pcs resource create Mount1 ocf:heartbeat:Filesystem
    device="/dev/disk/by-label/MyCluster:Data1" directory="/mnt/data1"
    fstype="gfs2" options="noatime,nodiratime,rw" op monitor
    interval="90" on-fail="fence" clone meta clone-max="3"
    clone-node-max="1" interleave="true"
    pcs resource create Mount2 ocf:heartbeat:Filesystem
    device="/dev/disk/by-label/MyCluster:Data2" directory="/mnt/data2"
    fstype="gfs2" options="noatime,nodiratime,rw" op monitor
    interval="90" on-fail="fence" clone meta clone-max="3"
    clone-node-max="1" interleave="true"
    pcs resource create Mount3 ocf:heartbeat:Filesystem
    device="/dev/disk/by-label/MyCluster:Data3" directory="/mnt/data3"
    fstype="gfs2" options="noatime,nodiratime,rw" op monitor
    interval="90" on-fail="fence" clone meta clone-max="3"
    clone-node-max="1" interleave="true"
    pcs constraint order DLM-clone then iSCSI1-clone
    pcs constraint order DLM-clone then iSCSI2-clone
    pcs constraint order DLM-clone then iSCSI3-clone
    pcs constraint order iSCSI1-clone then Mount1-clone
    pcs constraint order iSCSI2-clone then Mount2-clone
    pcs constraint order iSCSI3-clone then Mount3-clone

    If I issue "pcs cluster standby node1" or "pcs cluster stop" on
    node 1 and then reboot the node, then when the node comes back
    online (unstandby if it was put in standby mode) all the "MountX"
    resources get stopped on nodes 3 and 4 and started again.

    Can anyone help me figure out where the mistake in my configuration
    is? I would like to keep the resources running on the active nodes,
    i.e. avoid the stop and restart of resources.

    Thank you in advance
    Octavian Ciobanu
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
