Re: [ClusterLabs] Correctly stop pacemaker on 2-node cluster with SBD and failed devices?

Klaus Wenninger Tue, 15 Jun 2021 23:15:11 -0700

On Tue, Jun 15, 2021 at 10:41 PM Strahil Nikolov <hunter86...@yahoo.com>
wrote:


> Maybe you can try:
>
> while true ; do echo '0' > /proc/sys/kernel/nmi_watchdog ; sleep 1 ; done
>
> and in another shell stop pacemaker and sbd.
>
> I guess the only way to easily reproduce is with sbd over iscsi.
>
> Best Regards,
> Strahil Nikolov
>
> On Tue, Jun 15, 2021 at 21:30, Andrei Borzenkov
> <arvidj...@gmail.com> wrote:
> On 15.06.2021 20:48, Strahil Nikolov wrote:
> > I'm using 'pcs cluster stop' (or its crm alternative), yet I'm not sure
> if it will help in this case.
> >
>
> No, it won't. It will still stop pacemaker.
>
I guess this is really a delicate issue, and we might think of adding
some handle here. Of course, this kind of handle always comes with a
certain amount of risk that it might be used in a way that prevents a
node from suiciding when it actually should.
Unfortunately, the way 'pcs cluster stop' avoids suicides of single
nodes in larger clusters might not work here. There it first stops
pacemaker on all nodes and only then stops corosync, to keep quorum for
long enough and to have a quick shutdown of the rest. On a 2-node
cluster, however, sbd doesn't actually check for quorum but for the
number of nodes registered with the corosync protocol pacemaker uses.
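
To make the ordering concrete, here is a minimal dry-run sketch of the
two-phase stop described above. It only prints the commands and runs
nothing remotely. The node names and the use of ssh are assumptions for
illustration, and this is not what pcs literally executes. As noted, the
ordering may not help on a 2-node cluster with SBD.

```shell
#!/bin/sh
# Dry-run sketch: print the two-phase stop order, one command per line.
# Node names passed as arguments are hypothetical placeholders.

plan_cluster_stop() {
    # Phase 1: stop pacemaker on every node while corosync is still up.
    for node in "$@"; do
        echo "ssh $node systemctl stop pacemaker"
    done
    # Phase 2: only then stop corosync on every node.
    for node in "$@"; do
        echo "ssh $node systemctl stop corosync"
    done
}

plan_cluster_stop node1 node2
```

Printing the plan instead of executing it lets you review the order
before touching a live cluster.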

Regards,
Klaus

>
>
> > Most probably the safest way is to wait for the storage to be recovered,
> as without the pacemaker<->SBD communication, sbd will stop and the
> watchdog will be triggered.
>
> >
>
> What makes you think I am not aware of it?
>
> Can you suggest the steps to avoid it?
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>
