Re: [ClusterLabs] corosync not able to form cluster

Christine Caulfield Fri, 08 Jun 2018 02:05:22 -0700

On 07/06/18 18:32, Prasad Nagaraj wrote:
> Hi Christine - Got it:)
> 
> I have collected a few seconds of debug logs from all nodes after startup.
> Please find them attached.
> Please let me know if this will help us to identify the root cause.
>


The problem is on the node coro.4 - it never gets out of the JOIN process:

"Jun 07 16:55:37 corosync [TOTEM ] entering GATHER state from 11."

So something is wrong on that node: either a rogue routing table entry,
a dangling iptables rule, or even a broken NIC.

Chrissie
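
For reference, the logging change discussed in the quoted messages below
would look roughly like this. This is only a sketch: the to_logfile,
logfile and timestamp values are assumptions, not taken from the poster's
corosync.conf, and any existing output settings should be left as they
are. The relevant part is the top-level debug: on at the end of the
stanza, which applies to corosync as a whole rather than only to the AMF
subsystem.

    logging {
            # assumed output settings, keep whatever is already configured
            to_logfile: yes
            logfile: /var/log/cluster/corosync.log
            timestamp: on
            logger_subsys {
                    subsys: AMF
                    debug: off
            }
            # debug for corosync as a whole, added at the end of the stanza
            debug: on
    }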

> Thanks!
> 
> On Thu, Jun 7, 2018 at 8:43 PM, Christine Caulfield <ccaul...@redhat.com> wrote:
> 
>     On 07/06/18 15:53, Prasad Nagaraj wrote:
>     > Hi - As you can see in the corosync.conf details, I have already set
>     > debug: on
>     > 
> 
>     But only in the (disabled) AMF subsystem, not for corosync as a whole :)
> 
>         logger_subsys {
>         subsys: AMF
>         debug: on
>         }
> 
> 
>     Chrissie
> 
> 
>     > 
>     > On Thu, 7 Jun 2018, 8:03 pm Christine Caulfield, <ccaul...@redhat.com> wrote:
>     >
>     >     On 07/06/18 15:24, Prasad Nagaraj wrote:
>     >     >
>     >     > No iptables rules or other firewalls are set up on these nodes.
>     >     >
>     >     > One observation is that each node sends messages with its own
>     >     > ring sequence number, and these are not converging. I have seen
>     >     > that in a good cluster, when nodes respond with the same sequence
>     >     > number, the membership is formed automatically. But that is not
>     >     > what happens in our case.
>     >     >
>     >
>     >     That's just a side-effect of the cluster not forming. It's not
>     >     causing it. Can you enable full corosync debugging (just add
>     >     debug: on to the end of the logging {} stanza) and see if that
>     >     has any more useful information? (I only need the corosync bits,
>     >     not the pcmk ones.)
>     >
>     >     Chrissie
>     >
>     >     > Example: we can see that one node sends
>     >     > Jun 07 07:55:04 corosync [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 71084: memb=1, new=0, lost=0
>     >     > .....
>     >     > Jun 07 07:55:16 corosync [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 71096: memb=1, new=0, lost=0
>     >     > Jun 07 07:55:16 corosync [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 71096: memb=1, new=0, lost=0
>     >     >
>     >     > other node sends messages with its own numbers
>     >     > Jun 07 07:55:12 corosync [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 71088: memb=1, new=0, lost=0
>     >     > Jun 07 07:55:12 corosync [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 71088: memb=1, new=0, lost=0
>     >     > .......
>     >     > Jun 07 07:55:24 corosync [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 71100: memb=1, new=0, lost=0
>     >     > Jun 07 07:55:24 corosync [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 71100: memb=1, new=0, lost=0
>     >     >
>     >     > Any idea why this happens, and why the seq. numbers from
>     >     > different nodes are not converging?
>     >     >
>     >     > Thanks!
>     >     >
>     >     >
>     >     >
>     >     >
>     >     >
>     >     > _______________________________________________
>     >     > Users mailing list: Users@clusterlabs.org
>     >     > https://lists.clusterlabs.org/mailman/listinfo/users
>     >     >
>     >     > Project Home: http://www.clusterlabs.org
>     >     > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>     >     > Bugs: http://bugs.clusterlabs.org
>     >     >

