Re: [ClusterLabs] why is node fenced ?

Chris Walker Mon, 12 Aug 2019 10:47:12 -0700

When ha-idg-1 started Pacemaker around 17:43, it did not see ha-idg-2. For 
example:


Aug 09 17:43:05 [6318] ha-idg-1 pacemakerd:     info: pcmk_quorum_notification: Quorum retained | membership=1320 members=1

After ~20s (the dc-deadtime parameter), ha-idg-2 was marked 'unclean' and 
STONITHed as part of startup fencing.
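
If it helps when scanning the rest of the log, here is a minimal Python sketch 
that pulls the member count out of a pcmk_quorum_notification line like the 
one quoted above. The helper name parse_quorum_line and the regular expression 
are my own illustration, not anything shipped with Pacemaker, and the pattern 
only assumes the layout of that one line.

```python
import re

# Log line copied from this message (joined onto a single line).
LINE = ("Aug 09 17:43:05 [6318] ha-idg-1 pacemakerd:     info: "
        "pcmk_quorum_notification: Quorum retained | membership=1320 members=1")

# Hypothetical pattern: matches only the layout of the line shown above.
PATTERN = re.compile(
    r"(?P<node>\S+) pacemakerd:\s+info: pcmk_quorum_notification: "
    r"(?P<state>.+?) \| membership=(?P<membership>\d+) members=(?P<members>\d+)"
)


def parse_quorum_line(line):
    """Return node, quorum state, membership id and member count, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "node": match.group("node"),
        "state": match.group("state"),
        "membership": int(match.group("membership")),
        "members": int(match.group("members")),
    }


info = parse_quorum_line(LINE)
# members == 1 on a two-node cluster means the node saw only itself.
print(info["node"], "sees", info["members"], "member(s)")
```

A member count of 1 here is what tells us ha-idg-1 formed a membership on its 
own.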

There is nothing in ha-idg-2's HA logs around 17:43 indicating that it saw 
ha-idg-1 either, so it appears that there was no communication at all between 
the two nodes.

I'm not sure exactly why the nodes did not see one another, but there are 
indications of network issues around this time, for example:

2019-08-09T17:42:16.427947+02:00 ha-idg-2 kernel: [ 1229.245533] bond1: now running without any active interface!

so perhaps that's related.
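
To put the two log excerpts on one timeline, here is a small Python sketch of 
the arithmetic. It rests on two assumptions: the Pacemaker log line (which 
carries no year or time zone) is in the same year and +02:00 zone as the 
syslog entry, and dc-deadtime is the ~20s mentioned above. The variable names 
are only illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: both logs use the same local time zone (+02:00).
LOCAL = timezone(timedelta(hours=2))

# bond1 message from ha-idg-2's syslog (timestamp copied verbatim).
bond_down = datetime.fromisoformat("2019-08-09T17:42:16.427947+02:00")

# Quorum notification on ha-idg-1 ("Aug 09 17:43:05"); year 2019 is assumed.
quorum_seen = datetime(2019, 8, 9, 17, 43, 5, tzinfo=LOCAL)

# Seconds between the bond1 message and ha-idg-1 seeing only itself.
gap = (quorum_seen - bond_down).total_seconds()

# Assumption: dc-deadtime is about 20 seconds, as stated in this message.
dc_deadtime = timedelta(seconds=20)
earliest_unclean = quorum_seen + dc_deadtime

print(f"bond1 lost its interfaces {gap:.1f}s before the quorum notification")
print("peer could be marked unclean from about",
      earliest_unclean.strftime("%H:%M:%S"))
```

So the bond1 message on ha-idg-2 precedes the start of Pacemaker on ha-idg-1 
by under a minute, which is consistent with the nodes not seeing each other, 
though it does not prove the cause.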

HTH,
Chris


On 8/12/19, 12:09 PM, "Users on behalf of Lentes, Bernd" 
<users-boun...@clusterlabs.org on behalf of bernd.len...@helmholtz-muenchen.de> 
wrote:

    Hi,
    
    Last Friday (9th of August) I had to install patches on my two-node cluster.
    I put one of the nodes (ha-idg-2) into standby (crm node standby ha-idg-2),
    patched it, rebooted, started the cluster again (systemctl start pacemaker),
    and brought the node online again. Everything was fine.
    
    Then I wanted to do the same procedure with the other node (ha-idg-1).
    I put it in standby, patched it, rebooted, and started pacemaker again.
    But then ha-idg-1 fenced ha-idg-2; it said the node is unclean.
    I know that nodes which are unclean need to be shut down, that's logical.
    
    But I don't know where the conclusion that the node is unclean comes from,
    or why it is unclean.
    I searched the logs and didn't find any hint.
    
    I put the syslog and the pacemaker log on a seafile share. I'd be very
    thankful if you'd have a look.
    https://hmgubox.helmholtz-muenchen.de/d/53a10960932445fb9cfe/
    
    Here is the CLI history of the commands:
    
    17:03:04  crm node standby ha-idg-2
    17:07:15  zypper up (install updates on ha-idg-2)
    17:17:30  systemctl reboot
    17:25:21  systemctl start pacemaker.service
    17:25:47  crm node online ha-idg-2
    17:26:35  crm node standby ha-idg-1
    17:30:21  zypper up (install updates on ha-idg-1)
    17:37:32  systemctl reboot
    17:43:04  systemctl start pacemaker.service
    17:44:00  ha-idg-2 is fenced
    
    Thanks.
    
    Bernd
    
    OS is SLES 12 SP4, pacemaker 1.1.19, corosync 2.3.6-9.13.1
    
    
    -- 
    
    Bernd Lentes 
    Systemadministration 
    Institut für Entwicklungsgenetik 
    Gebäude 35.34 - Raum 208 
    HelmholtzZentrum münchen 
    bernd.len...@helmholtz-muenchen.de 
    phone: +49 89 3187 1241 
    phone: +49 89 3187 3827 
    fax: +49 89 3187 2294 
    http://www.helmholtz-muenchen.de/idg 
    
    Perfect is whoever makes no mistakes 
    So the dead are perfect
     
    
    Helmholtz Zentrum Muenchen
    Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
    Ingolstaedter Landstr. 1
    85764 Neuherberg
    www.helmholtz-muenchen.de
    Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
    Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich 
Bassler, Kerstin Guenther
    Registergericht: Amtsgericht Muenchen HRB 6466
    USt-IdNr: DE 129521671
    
    _______________________________________________
    Manage your subscription:
    https://lists.clusterlabs.org/mailman/listinfo/users
    
    ClusterLabs home: https://www.clusterlabs.org/

