31.08.2017 14:53, Octavian Ciobanu wrote:
I'm back with confirmation that DLM is what triggers the Mount
resources to stop when the stopped/suspended node rejoins.

When the DLM resource starts on the rejoining node, it tries to get a
free journal but always grabs one that is already occupied by another
node, triggering a domino effect: each displaced node jumps to the next
node's occupied DLM journal, stopping and restarting the Mount resources.

I have plenty of DLM journals (I allocated 10 for a 3-node
configuration), so there are certainly unused ones available.

Is there a way to make DLM keep its journal when the node is stopped and
reuse it when the node starts again? Or a way to make it keep its active
allocations, forcing the rejoining node to look for an unoccupied journal
instead of pushing out the node that currently holds the journal it
allocates for the rejoining node?

Did you try my suggestion from the previous e-mail?

'Probably you also do not need 'ordered="true"' for your DLM clone? Knowing what DLM is, it does not need ordering; its instances may be safely started in parallel.'
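For reference, one way to drop that option on the existing clone with pcs might be (exact syntax depends on the pcs version; DLM-clone is the clone name used elsewhere in this thread):

    pcs resource meta DLM-clone ordered=false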



Best regards

On Mon, Aug 28, 2017 at 6:04 PM, Octavian Ciobanu
<coctavian1...@gmail.com> wrote:

    Thank you for the info.
    Looking over the crm_simulate output I noticed the "notice" messages,
    and with the help of debug mode I found this sequence in the log:

    Aug 28 16:23:19 [13802] node03 crm_simulate:    debug:
    native_assign_node:      Assigning node01 to DLM:2
    Aug 28 16:23:19 [13802] node03 crm_simulate:   notice:
    color_instance:  Pre-allocation failed: got node01 instead of node02
    Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
    native_deallocate:       Deallocating DLM:2 from core01
    Aug 28 16:23:19 [13802] node03 crm_simulate:    debug:
    native_assign_node:      Assigning core01 to DLM:3
    Aug 28 16:23:19 [13802] node03 crm_simulate:   notice:
    color_instance:  Pre-allocation failed: got node01 instead of node03
    Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
    native_deallocate:       Deallocating DLM:3 from core01
    Aug 28 16:23:19 [13802] node03 crm_simulate:    debug:
    native_assign_node:      Assigning core01 to DLM:0
    Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
    rsc_merge_weights:       DLM:2: Rolling back scores from iSCSI2-clone
    Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
    rsc_merge_weights:       DLM:2: Rolling back scores from iSCSI2-clone
    Aug 28 16:23:19 [13802] node03 crm_simulate:    debug:
    native_assign_node:      Assigning node03 to DLM:2
    Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
    rsc_merge_weights:       DLM:3: Rolling back scores from iSCSI2-clone
    Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
    rsc_merge_weights:       DLM:3: Rolling back scores from iSCSI2-clone
    Aug 28 16:23:19 [13802] node03 crm_simulate:    debug:
    native_assign_node:      Assigning node02 to DLM:3

    This suggests that the restarted node attempts to occupy a DLM
    journal that is allocated to another node and, by doing so, triggers a
    chain reaction leading to all resources being restarted on all nodes.

    I will try a different approach (based on your suggestions) to
    starting the DLM, iSCSI and Mount resources and see if that changes
    anything.

    If the log gives you any other suggestions, they are welcome.

    Thank you again for the help.

    On Mon, Aug 28, 2017 at 3:53 PM, Vladislav Bogdanov
    <bub...@hoster-ok.com> wrote:

        28.08.2017 14:03, Octavian Ciobanu wrote:

            Hey Vladislav,

            Thank you for the info. I've tried your suggestions but the
            behavior is still the same. When an offline/standby node rejoins
            the cluster, all the resources are first stopped and then started.
            I've added the changes I've made next to your suggestions; see
            below in the reply message.


        Logs on the DC (the node where you see logs from the pengine
        process) should contain references to pe-input-XX.bz2 files,
        something like "notice: Calculated transition XXXX, saving inputs in
        /var/lib/pacemaker/pengine/pe-input-XX.bz2".
        Locate one for which Stop actions occur.
        You can replay it with 'crm_simulate -S -x
        /var/lib/pacemaker/pengine/pe-input-XX.bz2' to see whether it is
        the correct one (look in the middle of the output).

        After that you may add some debugging:
        PCMK_debug=yes PCMK_logfile=./pcmk.log crm_simulate -S -x
        /var/lib/pacemaker/pengine/pe-input-XX.bz2

        That will produce a big file with all debugging messages enabled.

        Try to locate a reason for restarts there.
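        A quick way to narrow that file down (just an illustration; the
        strings are the ones that show up in the crm_simulate debug output
        quoted in the follow-up above) is something like:

            grep -E 'Pre-allocation failed|native_deallocate|Rolling back' ./pcmk.log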

        Best,
        Vladislav

        Also, please look inline (maybe the info there will be enough so
        you won't need to debug).


            Once again, thank you for the info.

            Best regards.
            Octavian Ciobanu

            On Sat, Aug 26, 2017 at 8:17 PM, Vladislav Bogdanov
            <bub...@hoster-ok.com> wrote:

                26.08.2017 19:36, Octavian Ciobanu wrote:

                    Thank you for your reply.

                    There is no reason to set location for the resources, I
                    think, because all the resources are set with clone
                    options, so they are started on all nodes at the same
                    time.


                You still need to colocate "upper" resources with their
                dependencies. Otherwise pacemaker will try to start them even
                if their dependencies fail. Order without colocation has very
                limited use (usually when resources may run on different
                nodes). For clones that is even more exotic.


            I've added colocation constraints:

            pcs constraint colocation add iSCSI1-clone with DLM-clone
            pcs constraint colocation add iSCSI2-clone with DLM-clone
            pcs constraint colocation add iSCSI3-clone with DLM-clone
            pcs constraint colocation add Mount1-clone with iSCSI1-clone
            pcs constraint colocation add Mount2-clone with iSCSI2-clone
            pcs constraint colocation add Mount4-clone with iSCSI3-clone

            The result is the same ... all clones are first stopped and then
            started, beginning with the DLM resource and ending with the
            Mount ones.


        Yep, that was not meant to fix your problem. Just to prevent
        future issues.



                For your original question: ensure you have interleave=true
                set for all your clones. You seem to be missing it for the
                iSCSI ones. interleave=false (the default) is for different
                uses (when upper resources require all clone instances to be
                up).
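                For reference, a sketch of how that could be set on the
                existing clones with pcs (exact syntax depends on the pcs
                version; the clone names are the ones used in this thread):

                    pcs resource meta iSCSI1-clone interleave=true
                    pcs resource meta iSCSI2-clone interleave=true
                    pcs resource meta iSCSI3-clone interleave=true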


            I modified the iSCSI resources and added interleave="true", but
            there is still no change in behavior.


        Weird... Probably you also do not need 'ordered="true"' for your
        DLM clone? Knowing what DLM is, it does not need ordering; its
        instances may be safely started in parallel.



                Also, just a minor note: iSCSI resources do not actually
                depend on DLM; the mounts should depend on it.


            I know, but the Mount resource must know when the iSCSI resource
            it is connected to has started, so the only solution I saw was to
            place DLM before iSCSI and then Mount. If there is another
            solution, a proper way to do it, could you give me a reference or
            point me to where I can read how to do it?


        You would want to colocate (and order) the mount with both DLM and
        iSCSI. Multiple colocations/orders for the same resource are allowed.
        For the mount you need DLM running and the iSCSI disk connected. But
        you actually do not need DLM to connect the iSCSI disk (so the DLM
        and iSCSI resources may start in parallel).
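        A minimal sketch of that layout with pcs, using the resource names
        from this thread (repeat analogously for Mount2/Mount3; a sketch
        only, not tested here):

            pcs constraint colocation add Mount1-clone with DLM-clone
            pcs constraint colocation add Mount1-clone with iSCSI1-clone
            pcs constraint order DLM-clone then Mount1-clone
            pcs constraint order iSCSI1-clone then Mount1-clone

        With no DLM-then-iSCSI ordering, the DLM and iSCSI clones are free
        to start in parallel.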



                    And when it comes to stickiness, I forgot to mention
                    that it is set to 200. I also have stonith configured to
                    use VMware ESXi.

                    Best regards
                    Octavian Ciobanu

                    On Sat, Aug 26, 2017 at 6:16 PM, John Keates
                    <j...@keates.nl> wrote:

                        While I am by no means a CRM/Pacemaker expert, I
                        only see the resource primitives and the order
                        constraints. Wouldn’t you need location and/or
                        colocation as well as stickiness settings to prevent
                        this from happening? What I think it might be doing
                        is seeing the new node, then trying to move the
                        resources (but not finding it a suitable target) and
                        then moving them back where they came from, fast
                        enough that you only see it as a restart.

                        If you run crm_resource -P, it should also restart
                        all resources, but put them in their preferred spots.
                        If they end up in the same place, you probably didn't
                        put any weighting in the config or have stickiness
                        set to INF.
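                        For reference, default stickiness is commonly set
                        cluster-wide with something like the following (pcs
                        syntax varies by version; 200 is the value reported
                        elsewhere in this thread):

                            pcs resource defaults resource-stickiness=200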

                        Kind regards,

                        John Keates

                            On 26 Aug 2017, at 14:23, Octavian Ciobanu
                            <coctavian1...@gmail.com> wrote:

                            Hello all,

                            While playing with the cluster configuration I
                            noticed a strange behavior. If I stop/standby the
                            cluster services on one node and reboot it, when
                            it rejoins the cluster all the resources that
                            were started and working on the active nodes get
                            stopped and restarted.

                            My testing configuration is based on 4 nodes.
                            One node is a storage node that makes 3 iSCSI
                            targets available for the other nodes to use; it
                            is not configured to join the cluster. The other
                            three nodes are configured in a cluster using the
                            following commands:

                            pcs resource create DLM ocf:pacemaker:controld op monitor interval="60" on-fail="fence" clone meta clone-max="3" clone-node-max="1" interleave="true" ordered="true"
                            pcs resource create iSCSI1 ocf:heartbeat:iscsi portal="10.0.0.1:3260" target="iqn.2017-08.example.com:tgt1" op start interval="0" timeout="20" op stop interval="0" timeout="20" op monitor interval="120" timeout="30" clone meta clone-max="3" clone-node-max="1"
                            pcs resource create iSCSI2 ocf:heartbeat:iscsi portal="10.0.0.1:3260" target="iqn.2017-08.example.com:tgt2" op start interval="0" timeout="20" op stop interval="0" timeout="20" op monitor interval="120" timeout="30" clone meta clone-max="3" clone-node-max="1"
                            pcs resource create iSCSI3 ocf:heartbeat:iscsi portal="10.0.0.1:3260" target="iqn.2017-08.example.com:tgt3" op start interval="0" timeout="20" op stop interval="0" timeout="20" op monitor interval="120" timeout="30" clone meta clone-max="3" clone-node-max="1"
                            pcs resource create Mount1 ocf:heartbeat:Filesystem device="/dev/disk/by-label/MyCluster:Data1" directory="/mnt/data1" fstype="gfs2" options="noatime,nodiratime,rw" op monitor interval="90" on-fail="fence" clone meta clone-max="3" clone-node-max="1" interleave="true"
                            pcs resource create Mount2 ocf:heartbeat:Filesystem device="/dev/disk/by-label/MyCluster:Data2" directory="/mnt/data2" fstype="gfs2" options="noatime,nodiratime,rw" op monitor interval="90" on-fail="fence" clone meta clone-max="3" clone-node-max="1" interleave="true"
                            pcs resource create Mount3 ocf:heartbeat:Filesystem device="/dev/disk/by-label/MyCluster:Data3" directory="/mnt/data3" fstype="gfs2" options="noatime,nodiratime,rw" op monitor interval="90" on-fail="fence" clone meta clone-max="3" clone-node-max="1" interleave="true"
                            pcs constraint order DLM-clone then iSCSI1-clone
                            pcs constraint order DLM-clone then iSCSI2-clone
                            pcs constraint order DLM-clone then iSCSI3-clone
                            pcs constraint order iSCSI1-clone then Mount1-clone
                            pcs constraint order iSCSI2-clone then Mount2-clone
                            pcs constraint order iSCSI3-clone then Mount3-clone

                            If I issue "pcs cluster standby node1" or
                            "pcs cluster stop" on node 1 and then reboot the
                            node, when the node comes back online (unstandby
                            if it was put in standby mode) all the "MountX"
                            resources on nodes 3 and 4 get stopped and
                            started again.
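                            The test sequence, roughly (a sketch only, with
                            the node name used above):

                                pcs cluster standby node1
                                reboot
                                pcs cluster unstandby node1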

                            Can anyone help me figure out where the mistake
                            in my configuration is, as I would like to keep
                            the started resources running on the active nodes
                            (avoiding the stop and start of resources)?

                            Thank you in advance
                            Octavian Ciobanu
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
