Set the corosync token to 10000 milliseconds, adjust the consensus as described
in 'man 5 corosync.conf', and give it a try.
Don't forget to sync the corosync settings across all cluster nodes.
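
As a rough sketch of the change (not a complete file), the totem section in
/etc/corosync/corosync.conf could look like the fragment below. The consensus
value of 12000 is an assumption on my side, based on the man page's rule that
consensus must be at least 1.2 x token; keep the other settings in your
existing totem section as they are.

```
totem {
    # keep your existing settings (version, cluster_name, transport, ...) here
    # token timeout in milliseconds
    token: 10000
    # assumption: 1.2 * token, the minimum per corosync.conf(5)
    consensus: 12000
}
```

The file has to be identical on both nodes, and corosync needs to reload or
restart before the new timeouts take effect.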
Best Regards,
Strahil Nikolov
 
 
On Fri, Apr 15, 2022 at 15:27, vitaly <[email protected]> wrote:

Hello Everybody,
I occasionally see the following behavior on a two-node cluster.
1. Both nodes of the cluster are rebooted abruptly (using "reboot").
2. Both nodes start to come up. Node d18-3-left (2) comes up first:
Apr 13 23:56:09 d18-3-left corosync[11465]:  [MAIN  ] Corosync Cluster Engine 
('2.4.4'): started and ready to provide service.

3. The second node, d18-3-right (1), joins the cluster:

Apr 13 23:56:58 d18-3-left corosync[11466]:  [TOTEM ] A new membership 
(172.16.1.1:60) was formed. Members joined: 1
Apr 13 23:56:58 d18-3-left corosync[11466]:  [QUORUM] This node is within the 
primary component and will provide service.
Apr 13 23:56:58 d18-3-left corosync[11466]:  [QUORUM] Members[2]: 1 2
Apr 13 23:56:58 d18-3-left corosync[11466]:  [MAIN  ] Completed service 
synchronization, ready to provide service.
Apr 13 23:56:58 d18-3-left pacemakerd[11717]:  notice: Quorum acquired
Apr 13 23:56:58 d18-3-left crmd[11763]:  notice: Quorum acquired

4. Two seconds later, node d18-3-left reports I_DC_TIMEOUT and starts fencing
the newly joined node:

Apr 13 23:57:00 d18-3-left crmd[11763]:  warning: Input I_DC_TIMEOUT received 
in state S_PENDING from crm_timer_popped
After that we get:
Apr 13 23:57:00 d18-3-left crmd[11763]:  notice: State transition S_ELECTION -> 
S_INTEGRATION
Apr 13 23:57:00 d18-3-left crmd[11763]:  warning: Input I_ELECTION_DC received 
in state S_INTEGRATION from do_election_check

and the node is fenced:
Apr 13 23:57:01 d18-3-left pengine[11762]:  warning: Scheduling Node 
d18-3-right.lab.archivas.com for STONITH
Apr 13 23:57:01 d18-3-left pengine[11762]:  notice:  * Fence (reboot) 
d18-3-right.lab.archivas.com 'node is unclean'

5. After this, the fenced node comes up again and joins the cluster without
any issues.

Any idea on what is going on here?
Thanks,
_Vitaly
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
  