----- On Jun 23, 2019, at 1:40 PM, Somanath Jeeva [email protected] wrote:

> Hi All,
> I have a two-node cluster using the multicast (UDP) transport. The multicast
> IP used is 224.1.1.1.
> Whenever there is a CPU-intensive task, the pcs cluster goes into a split-brain
> scenario and doesn't recover automatically. We have to manually restart the
> services to bring both nodes online again. Before the nodes go into split
> brain, the corosync log shows:
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:51:42 server1 corosync[4745]: [TOTEM ] A processor failed, forming new configuration.
> May 24 16:41:42 server1 corosync[4745]: [TOTEM ] A new membership (10.241.31.12:29276) was formed. Members left: 1
> May 24 16:41:42 server1 corosync[4745]: [TOTEM ] Failed to receive the leave message. failed: 1
> Is there any way we can overcome this, or could this be due to multicast
> issues on the network side?
> With Regards
> Somanath Thilak J

I have atop running on all of my systems. It helps a lot with debugging and
troubleshooting, and it should be available for all distros.
In /etc/atop/atop.daily (on SUSE systems) I set "INTERVAL=1",
so resource usage is logged every second. Attention!
This creates big logfiles in /var/log/atop (on SUSE),
between two and three gigabytes.
Make sure you have enough storage, or configure atop.daily so that only
some logs are kept.
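
If you would rather prune old logs yourself than rely on atop.daily, something
like the following could work. This is only a sketch: prune_atop_logs is a
hypothetical helper name, not part of atop, and the atop_* file name pattern is
an assumption, so check what is actually in your /var/log/atop first.

```shell
# Hypothetical helper, not part of atop: remove atop logs older than N days.
prune_atop_logs() {
    logdir="$1"   # log directory, e.g. /var/log/atop
    days="$2"     # delete files last modified more than this many days ago
    # The atop_* name pattern is an assumption; adjust it to your log names.
    find "$logdir" -type f -name 'atop_*' -mtime +"$days" -exec rm -f {} +
}

# Example call (keeps roughly one week of logs):
# prune_atop_logs /var/log/atop 7
```

Run it from cron or by hand; with one-second sampling it is worth checking the
free space in /var/log regularly either way.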

With atop you get a precise view of what the system was doing in the seconds
before fencing, e.g. which processes were using a lot of resources.
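
To look at the minutes around the "A processor failed" message from the quoted
log, you can replay the recorded data with atop -r and limit the window with
-b and -e. A minimal sketch, assuming the log file is named atop_YYYYMMDD under
/var/log/atop and that the log lines are from 2019 (the log itself shows no
year). atop_replay_cmd is a hypothetical helper that only prints the command,
so you can check it before running it.

```shell
# Hypothetical helper: print the atop command that replays one day's log
# between two times of day. It does not run atop itself.
atop_replay_cmd() {
    day="$1"     # date as YYYYMMDD (file name scheme is an assumption)
    begin="$2"   # start of window, hh:mm
    end="$3"     # end of window, hh:mm
    logdir="${ATOP_LOGDIR:-/var/log/atop}"
    printf 'atop -r %s/atop_%s -b %s -e %s\n' "$logdir" "$day" "$begin" "$end"
}

# Window around "May 24 15:51:42 ... A processor failed" (year assumed: 2019).
atop_replay_cmd 20190524 15:50 15:52
```

Inside the replay, 't' steps forward one sample and 'T' steps back, so with a
one-second interval you can walk through the moments before the failure.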

Bernd
 

