Hi!

It seems your network connection is unreliable and you don't have a "second independent ring". You can increase the timeout (as suggested), but that doesn't fix the underlying network problem.
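For illustration only, a second ring in corosync 2.x is usually configured roughly like the fragment below. This is a minimal sketch, not a tested configuration: the cluster name and every address except 172.16.1.1 are made-up placeholders, and 172.16.1.1 is only assumed to be node 1's first-ring address because it appears in the membership line of the quoted log. The second ring should run over physically separate network hardware, otherwise it adds no redundancy.

```
# /etc/corosync/corosync.conf (fragment, corosync 2.x)
totem {
    version: 2
    # hypothetical cluster name
    cluster_name: example
    transport: udpu
    # passive: use one ring at a time and switch to the other on failure
    rrp_mode: passive
}

nodelist {
    node {
        nodeid: 1
        # assumed from the quoted log
        ring0_addr: 172.16.1.1
        # hypothetical address on a second, independent network
        ring1_addr: 192.168.100.1
    }
    node {
        nodeid: 2
        # hypothetical addresses
        ring0_addr: 172.16.1.2
        ring1_addr: 192.168.100.2
    }
}
```

Comments are kept on their own lines because corosync.conf only treats lines beginning with # as comments. After restarting corosync on both nodes, "corosync-cfgtool -s" should report the status of both rings.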
Regards,
Ulrich

>>> vitaly <[email protected]> wrote on 15.04.2022 at 14:26 in message
<1442265456.65535.1650025606...@webmail6b.networksolutionsemail.com>:
> Hello Everybody.
> I occasionally see the following behavior on a two-node cluster.
> 1. Abruptly rebooting both nodes of the cluster (using "reboot")
> 2. Both nodes start to come up. Node d18-3-left (2) comes up first
> Apr 13 23:56:09 d18-3-left corosync[11465]: [MAIN ] Corosync Cluster Engine
> ('2.4.4'): started and ready to provide service.
>
> 3. Second node d18-3-right (1) joins the cluster
>
> Apr 13 23:56:58 d18-3-left corosync[11466]: [TOTEM ] A new membership
> (172.16.1.1:60) was formed. Members joined: 1
> Apr 13 23:56:58 d18-3-left corosync[11466]: [QUORUM] This node is within the
> primary component and will provide service.
> Apr 13 23:56:58 d18-3-left corosync[11466]: [QUORUM] Members[2]: 1 2
> Apr 13 23:56:58 d18-3-left corosync[11466]: [MAIN ] Completed service
> synchronization, ready to provide service.
> Apr 13 23:56:58 d18-3-left pacemakerd[11717]: notice: Quorum acquired
> Apr 13 23:56:58 d18-3-left crmd[11763]: notice: Quorum acquired
>
> 4. 2 seconds later node d18-3-left shows I_DC_TIMEOUT and starts fencing of
> the newly joined node.
>
> Apr 13 23:57:00 d18-3-left crmd[11763]: warning: Input I_DC_TIMEOUT received
> in state S_PENDING from crm_timer_popped
> After that we get:
> Apr 13 23:57:00 d18-3-left crmd[11763]: notice: State transition S_ELECTION ->
> S_INTEGRATION
> Apr 13 23:57:00 d18-3-left crmd[11763]: warning: Input I_ELECTION_DC received
> in state S_INTEGRATION from do_election_check
>
> and fence the node:
> Apr 13 23:57:01 d18-3-left pengine[11762]: warning: Scheduling Node
> d18-3-right.lab.archivas.com for STONITH
> Apr 13 23:57:01 d18-3-left pengine[11762]: notice: * Fence (reboot)
> d18-3-right.lab.archivas.com 'node is unclean'
>
> 5. After this the node that was fenced comes up again and joins the cluster
> without any issues.
>
> Any idea on what is going on here?
> Thanks,
> _Vitaly
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
