Re: [ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)

Jan Friesse Wed, 05 Oct 2016 00:04:05 -0700

Martin,

Hello all,


I am trying to understand the following 2 Corosync heartbeat ring failure
scenarios I have been testing, and hope somebody can explain why this makes
any sense.


Consider the following cluster:

     * 3x Nodes: A, B and C
     * 2x NICs for each Node
     * Corosync 2.3.5 configured with "rrp_mode: passive" and
       udpu transport with ring id 0 and 1 on each node.
     * On each node "corosync-cfgtool -s" shows:
         [...] ring 0 active with no faults
         [...] ring 1 active with no faults


Consider the following scenarios:

     1. On node A only, block all communication on the first NIC, configured with ring id 0
     2. On node A only, block all communication on all NICs, configured with ring id 0 and 1


The result of the above scenarios is as follows:

     1. Nodes A, B and C (!) display the following ring status:
         [...] Marking ringid 0 interface <IP-Address> FAULTY
         [...] ring 1 active with no faults
     2. Node A is shown as OFFLINE - B and C display the following ring status:
         [...] ring 0 active with no faults
         [...] ring 1 active with no faults


Questions:
     1. Is this the expected outcome ?

Yes

     2. In experiment 1, B and C can still communicate with each other over both NICs,
        so why are B and C not displaying a "no faults" status for ring id 0 and 1,
        just like in experiment 2

Because this is how RRP works. RRP marks the whole ring as failed, so every node sees that ring as failed.

        when node A is completely unreachable ?

Because it's a different scenario. In scenario 1 there is a 3-node membership where one of the nodes has one failed ring -> the whole ring is failed. In scenario 2 there is a 2-node membership where both rings work as expected. Node A is completely unreachable and it's not in the membership.
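
The rule above can be sketched as a small toy model. This is only an illustration of the reasoning, not Corosync's actual implementation: the function names (membership_after, ring_status) and the "blocked" mapping are hypothetical, and the model assumes a node that has every ring blocked simply drops out of the membership seen by the remaining nodes.

```python
# Toy model only: names and logic are illustrative assumptions,
# not taken from the Corosync source code.

def membership_after(nodes, blocked, rings=(0, 1)):
    """Return the set of nodes that can still reach the others.

    nodes:   iterable of node names
    blocked: dict mapping node name -> set of ring ids blocked on that node
    A node with every ring blocked is unreachable and leaves the membership.
    """
    return {n for n in nodes if not set(rings) <= blocked.get(n, set())}


def ring_status(membership, blocked, rings=(0, 1)):
    """Return the per-ring status shared by every member of the membership.

    A ring is marked faulty for the whole membership as soon as one
    member cannot use it.
    """
    status = {}
    for ring in rings:
        broken = any(ring in blocked.get(node, set()) for node in membership)
        status[ring] = "FAULTY" if broken else "active with no faults"
    return status


nodes = {"A", "B", "C"}

# Scenario 1: ring 0 blocked on A only. A stays in the membership,
# so ring 0 is faulty for A, B and C.
blocked_1 = {"A": {0}}
members_1 = membership_after(nodes, blocked_1)
print(sorted(members_1), ring_status(members_1, blocked_1))

# Scenario 2: both rings blocked on A. A leaves the membership,
# so B and C see both rings as healthy.
blocked_2 = {"A": {0, 1}}
members_2 = membership_after(nodes, blocked_2)
print(sorted(members_2), ring_status(members_2, blocked_2))
```

The model only covers the view of the nodes that remain together (B and C in scenario 2); what node A reports about its own rings once it is isolated is outside this sketch.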


Regards,
  Honza



Regards,
Martin Schlegel

_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



