>>> Digimer <[email protected]> wrote on 26.02.2021 at 17:34 in message
<[email protected]>:
> On 2021-02-26 11:19 a.m., Eric Robinson wrote:
>> At 5:16 am Pacific time Monday, one of our cluster nodes failed and its
>> mysql services went down. The cluster did not automatically recover.
>>
>> We’re trying to figure out:
>>
>> 1. Why did it fail?
>> 2. Why did it not automatically recover?
>>
>> The cluster did not recover until we manually executed…
>>
>> # pcs resource cleanup p_mysql_622
>>
>> OS: CentOS Linux release 7.5.1804 (Core)
>>
>> Cluster version:
>>
>> corosync.x86_64 2.4.5-4.el7 @base
>> corosync-qdevice.x86_64 2.4.5-4.el7 @base
>> pacemaker.x86_64 1.1.21-4.el7 @base
>>
>> Two nodes: 001db01a, 001db01b
>>
>> The following log snippet is from node 001db01a:
>>
>> [root@001db01a cluster]# grep "Feb 22 05:1[67]" corosync.log-20210223
>
> <snip>
>
>> Feb 22 05:16:30 [91682] 001db01a pengine: warning: cluster_status: Fencing and resource management disabled due to lack of quorum
>
> Seems like there was no quorum from this node's perspective, so it won't
> do anything. What do the other node's logs say?
@Digimer: The other node's log was included ;-)

Regards,
Ulrich
