>>> shivraj dongawe <shivraj...@gmail.com> wrote on 25.02.2021 at 07:34 in message
<calpaho9w3scnddoqw1-yfmjhfmy8mz-i6cyu0cc3bx6x26c...@mail.gmail.com>:
> @Ken Gaillot, thanks for sharing your inputs on the possible behavior of
> the cluster.
> We have reconfirmed that dlm on the healthy node was waiting for fencing of
> the faulty node, and that shared storage access on the healthy node was
> blocked during this process.
> Kindly let me know whether this is the natural behavior or the result of
> some misconfiguration.
> As asked, I am sharing the configuration information as an attachment to
> this mail.
Hi!

I think this is the way it's intended to be: If a node is "unclean" (faulty),
then DLM waits for confirmation of the unclean node becoming clean (i.e. being
fenced, known to be off). Then a new cluster configuration (quorum) is formed,
and possible recovery actions (like releasing locks the fenced node held) take
place. I see with OCFS2 that I/O may hang while the cluster is waiting for a
node to be fenced.

Regards,
Ulrich
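[For illustration, a sketch of how the "waiting for fencing" state Ulrich
describes can be inspected on the surviving node; the lockspace names come
from the dlm_controld syslog lines quoted below, and the exact output format
depends on the dlm version in use.]

  # Per-lockspace state; while a fence is outstanding, the affected
  # lockspaces show a pending membership change with a fencing wait condition.
  dlm_tool ls

  # dlm_controld's internal log; the "wait for fencing" messages seen in
  # syslog (e.g. for lvm_global and lvm_postgres_db_vg) appear here too.
  dlm_tool dump | grep -i fenc

  # Cross-check what Pacemaker itself thinks about fencing of the lost node.
  stonith_admin --history node1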
>
> On Fri, Feb 19, 2021 at 11:28 PM Ken Gaillot <kgail...@redhat.com> wrote:
>
>> On Fri, 2021-02-19 at 07:48 +0530, shivraj dongawe wrote:
>> > Any update on this?
>> > Is there any issue in the configuration that we are using?
>> >
>> > On Mon, Feb 15, 2021, 14:40 shivraj dongawe <shivraj...@gmail.com>
>> > wrote:
>> > > Kindly read "fencing is done using fence_scsi" from the previous
>> > > message as "fencing is configured".
>> > >
>> > > From the error messages we have analyzed, node2 initiated fencing
>> > > of node1 because many cluster-related processes on node1 had been
>> > > killed by the oom killer and node1 was marked as down.
>> > > Many resources on node2 then waited for the fencing of node1, as
>> > > seen from the following syslog messages on node2:
>> > > dlm_controld[1616]: 91659 lvm_postgres_db_vg wait for fencing
>> > > dlm_controld[1616]: 91659 lvm_global wait for fencing
>> > >
>> > > These messages appeared while the postgresql-12 service was being
>> > > started on node2.
>> > > As the postgresql service depends on these services (dlm, lvmlockd
>> > > and gfs2), it did not start in time on node2, and node2 fenced
>> > > itself after declaring that the services could not be started on it.
>> > >
>> > > On Mon, Feb 15, 2021 at 9:00 AM Ulrich Windl
>> > > <ulrich.wi...@rz.uni-regensburg.de> wrote:
>> > > > >>> shivraj dongawe <shivraj...@gmail.com> wrote on 15.02.2021
>> > > > at 08:27 in message
>> > > > <CALpaHO_6LsYM=t76CifsRkFeLYDKQc+hY3kz7PRKp7b4se=-a...@mail.gmail.com>:
>> > > > > Fencing is done using fence_scsi.
>> > > > > Config details are as follows:
>> > > > > Resource: scsi (class=stonith type=fence_scsi)
>> > > > >   Attributes: devices=/dev/mapper/mpatha pcmk_host_list="node1 node2"
>> > > > >               pcmk_monitor_action=metadata pcmk_reboot_action=off
>> > > > >   Meta Attrs: provides=unfencing
>> > > > >   Operations: monitor interval=60s (scsi-monitor-interval-60s)
>> > > > >
>> > > > > On Mon, Feb 15, 2021 at 7:17 AM Ulrich Windl
>> > > > > <ulrich.wi...@rz.uni-regensburg.de> wrote:
>> > > > >
>> > > > >> >>> shivraj dongawe <shivraj...@gmail.com> wrote on 14.02.2021
>> > > > >> at 12:03 in message
>> > > > >> <calpaho--3erfwst70mbl-wm9g6yh3ytd-wda1r_cknbrsxu...@mail.gmail.com>:
>> > > > >> > We are running a two node cluster on Ubuntu 20.04 LTS.
>> > > > >> > Cluster-related package version details are as follows:
>> > > > >> > pacemaker/focal-updates,focal-security 2.0.3-3ubuntu4.1 amd64
>> > > > >> > pacemaker/focal 2.0.3-3ubuntu3 amd64
>> > > > >> > corosync/focal 3.0.3-2ubuntu2 amd64
>> > > > >> > pcs/focal 0.10.4-3 all
>> > > > >> > fence-agents/focal 4.5.2-1 amd64
>> > > > >> > gfs2-utils/focal 3.2.0-3 amd64
>> > > > >> > dlm-controld/focal 4.0.9-1build1 amd64
>> > > > >> > lvm2-lockd/focal 2.03.07-1ubuntu1 amd64
>> > > > >> >
>> > > > >> > Cluster configuration details:
>> > > > >> > 1. The cluster has shared storage mounted through a gfs2
>> > > > >> > filesystem with the help of dlm and lvmlockd.
>> > > > >> > 2. Corosync is configured to use knet for transport.
>> > > > >> > 3. Fencing is configured using fence_scsi on the shared storage
>> > > > >> > which is being used for the gfs2 filesystem.
>> > > > >> > 4. The two main resources configured are the cluster/virtual IP
>> > > > >> > and postgresql-12; postgresql-12 is configured as a systemd
>> > > > >> > resource.
>> > > > >> > We had done failover testing (rebooting/shutting down a node,
>> > > > >> > link failure) of the cluster and had observed that resources
>> > > > >> > were migrated properly to the active node.
>> > > > >> >
>> > > > >> > Recently we came across an issue which has occurred repeatedly
>> > > > >> > in a span of two days. Details are below:
>> > > > >> > 1. The out-of-memory killer is invoked on the active node and
>> > > > >> > starts killing processes. A sample is as follows:
>> > > > >> > postgres invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE),
>> > > > >> > order=0, oom_score_adj=0
>> > > > >> > 2. In one instance it started by killing pacemaker, in another
>> > > > >> > postgresql. It does not stop with a single process; it goes on
>> > > > >> > killing others (more concerning is the killing of cluster-related
>> > > > >> > processes) as well. We have observed that swap space on that node
>> > > > >> > is 2 GB against 96 GB of RAM and are in the process of increasing
>> > > > >> > swap space to see if this resolves the issue. Postgres is
>> > > > >> > configured with a shared_buffers value of 32 GB (which is way
>> > > > >> > less than 96 GB).
>> > > > >> > We are not yet sure which process is suddenly eating up that
>> > > > >> > much memory.
>> > > > >> > 3. As a result of the killed processes on node1, node2 tries to
>> > > > >> > fence node1, thereby initiating the stopping of cluster resources
>> > > > >> > on node1.
>> > > > >>
>> > > > >> How is fencing being done?
>> > > > >>
>> > > > >> > 4. At this point we reach a stage where it is assumed that node1
>> > > > >> > is down, and the application resources, cluster IP and postgresql
>> > > > >> > are being started on node2.
>> > > >
>> > > > This is why I was asking: Is your fencing successful ("assumed that
>> > > > node1 is down"), or isn't it?
>> > > >
>> > > > >> > 5. Postgresql on node2 fails to start within 60 sec (start
>> > > > >> > operation timeout) and is declared as failed. During the start
>> > > > >> > operation of postgres, we found many messages related to failure
>> > > > >> > of fencing and other resources such as dlm and the vg waiting for
>> > > > >> > fencing to complete.
>>
>> It does seem that DLM is where the problem occurs.
>>
>> Note that fencing is scheduled in two separate ways, once by DLM and
>> once by the cluster itself, when node1 is lost.
>>
>> The fencing scheduled by the cluster completes successfully:
>>
>> Feb 13 11:07:56 DB-2 pacemaker-controld[2451]: notice: Peer node1 was
>> terminated (reboot) by node2 on behalf of pacemaker-controld.2451: OK
>>
>> but DLM just attempts fencing over and over, eventually causing
>> resource timeouts. Those timeouts cause the cluster to schedule
>> resource recovery (stop+start), but the stops time out for the same
>> reason, and it is those stop timeouts that cause node2 to be fenced.
>>
>> I'm not familiar enough with DLM to know what might keep it from being
>> able to contact Pacemaker for fencing.
>>
>> Can you attach your configuration as well (with any sensitive info
>> removed)? I assume you've created an ocf:pacemaker:controld clone, and
>> that the other resources are layered on top of that with colocation and
>> ordering constraints.
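[For reference, a rough sketch of the kind of layered configuration Ken
describes, built from the fence_scsi attributes quoted above; the resource
names, volume group name, logical volume and mount point below are
illustrative placeholders, not taken from the poster's attached configuration.]

  # Fencing via SCSI persistent reservations (attributes as quoted above)
  pcs stonith create scsi fence_scsi devices=/dev/mapper/mpatha \
      pcmk_host_list="node1 node2" pcmk_monitor_action=metadata \
      pcmk_reboot_action=off op monitor interval=60s meta provides=unfencing

  # DLM and lvmlockd as clones, with lvmlockd layered on top of dlm
  pcs resource create dlm ocf:pacemaker:controld \
      op monitor interval=30s on-fail=fence clone interleave=true ordered=true
  pcs resource create lvmlockd ocf:heartbeat:lvmlockd \
      op monitor interval=30s on-fail=fence clone interleave=true ordered=true
  pcs constraint order start dlm-clone then lvmlockd-clone
  pcs constraint colocation add lvmlockd-clone with dlm-clone

  # Shared VG activation and the GFS2 mount on top of lvmlockd
  # (postgres_db_vg, postgres_db_lv and /var/lib/postgresql are placeholders)
  pcs resource create postgres_vg ocf:heartbeat:LVM-activate \
      vgname=postgres_db_vg vg_access_mode=lvmlockd activation_mode=shared \
      op monitor interval=30s clone interleave=true ordered=true
  pcs resource create postgres_fs ocf:heartbeat:Filesystem \
      device=/dev/postgres_db_vg/postgres_db_lv directory=/var/lib/postgresql \
      fstype=gfs2 op monitor interval=30s clone interleave=true ordered=true
  pcs constraint order start lvmlockd-clone then postgres_vg-clone
  pcs constraint colocation add postgres_vg-clone with lvmlockd-clone
  pcs constraint order start postgres_vg-clone then postgres_fs-clone
  pcs constraint colocation add postgres_fs-clone with postgres_vg-clone

The systemd postgresql-12 resource and the virtual IP would then be ordered
after, and colocated with, the filesystem clone in the same fashion.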
>>
>> > > > >> > Details of the syslog messages of node2 during this event are
>> > > > >> > attached in a file.
>> > > > >> > 6. After this point we reach a state where node1 and node2 both
>> > > > >> > go into a fenced state and resources are unrecoverable (all
>> > > > >> > resources on both nodes).
>> > > > >> >
>> > > > >> > Now, the out-of-memory issue on node1 can be taken care of by
>> > > > >> > increasing swap, finding out the process responsible for such
>> > > > >> > huge memory usage, and taking the necessary actions to minimize
>> > > > >> > that memory usage; but the other issue that remains unclear is
>> > > > >> > why the cluster did not shift to node2 cleanly and instead
>> > > > >> > became unrecoverable.
>> --
>> Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/