Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-15 Thread Klaus Wenninger
On Sat, Nov 5, 2022 at 9:45 PM Jehan-Guillaume de Rorthais via Users <
users@clusterlabs.org> wrote:

> On Sat, 5 Nov 2022 20:53:09 +0100
> Valentin Vidić via Users  wrote:
>
> > On Sat, Nov 05, 2022 at 06:47:59PM +, Robert Hayden wrote:
> > > That was my impression as well...so I may have something wrong.  My
> > > expectation was that SBD daemon should be writing to the /dev/watchdog
> > > within 20 seconds and the kernel watchdog would self fence.
> >
> > I don't see anything unusual in the config except that pacemaker mode is
> > also enabled. This means that the cluster is providing signal for sbd
> even
> > when the storage device is down, for example:
> >
> > 883 ?SL 0:00 sbd: inquisitor
> > 892 ?SL 0:00  \_ sbd: watcher: /dev/vdb1 - slot: 0 - uuid:
> ...
> > 893 ?SL 0:00  \_ sbd: watcher: Pacemaker
> > 894 ?SL 0:00  \_ sbd: watcher: Cluster
> >
> > You can strace different sbd processes to see what they are doing at any
> > point.
>
> I suspect both watchers should detect the loss of network/communication
> with
> the other node.
>
> BUT, when sbd is in Pacemaker mode, it doesn't reset the node if the
> local **Pacemaker** is still quorate (via corosync). See the full chapter:
> «If Pacemaker integration is activated, SBD will not self-fence if
> **device**
> majority is lost [...]»
>
> https://documentation.suse.com/sle-ha/15-SP4/html/SLE-HA-all/cha-ha-storage-protect.html
>
> Would it be possible that no node is shutting down because the cluster is in
> two-node mode? Because of this mode, both would keep the quorum expecting the
> fencing to kill the other one... Except there's no active fencing here, only
> "self-fencing".
>

Seems not to be the case here, but for completeness: this situation should be
recognized automatically by sbd (upstream since sometime in 2017, iirc) and,
instead of checking quorum, sbd would then check for the presence of 2 nodes in
the cpg group. I hope corosync prevents two_node and qdevice being set at the
same time. But even in that case I would rather expect unexpected self-fencing
than the opposite.
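(A quick way to check whether two_node and a qdevice ended up configured at the
same time could be something like this - just a sketch, exact key names from
memory:

  # corosync-cmapctl | grep -E 'quorum\.(two_node|device)'
  # corosync-quorumtool -s

corosync-quorumtool should show the 2Node and/or Qdevice flags in its output if
either is active.)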

Klaus


>
> To verify this guess, check the corosync conf for the "two_node" parameter and
> whether both nodes still report as quorate during the network outage using:
>
>   corosync-quorumtool -s
>
> If this turns out to be a good guess: without **active** fencing, I suppose a
> cluster cannot rely on the two-node mode. I'm not sure what the best setup
> would be, though.
>
> Regards,


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-15 Thread Klaus Wenninger
On Wed, Nov 9, 2022 at 2:58 PM Robert Hayden 
wrote:

>
> > -Original Message-
> > From: Users  On Behalf Of Andrei
> > Borzenkov
> > Sent: Wednesday, November 9, 2022 2:59 AM
> > To: Cluster Labs - All topics related to open-source clustering welcomed
> > 
> > Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> >
> > On Mon, Nov 7, 2022 at 5:07 PM Robert Hayden
> >  wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: Users  On Behalf Of Valentin
> > Vidic
> > > > via Users
> > > > Sent: Sunday, November 6, 2022 5:20 PM
> > > > To: users@clusterlabs.org
> > > > Cc: Valentin Vidić 
> > > > Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> > > >
> > > > On Sun, Nov 06, 2022 at 09:08:19PM +, Robert Hayden wrote:
> > > > > When SBD_PACEMAKER was set to "yes", the lack of network
> > connectivity
> > > > to the node
> > > > > would be seen and acted upon by the remote nodes (evicts and takes
> > > > > over ownership of the resources).  But the impacted node would just
> > > > > sit logging IO errors.  Pacemaker would keep updating the
> > /dev/watchdog
> > > > > device so SBD would not self evict.   Once I re-enabled the
> network,
> > then
> > > > the
> > > >
> > > > Interesting, not sure if this is the expected behaviour based on:
> > > >
> > > >
> > > > https://lists.clusterlabs.org/pipermail/users/2017-August/022699.html


Which versions of pacemaker/corosync/sbd are you using?
Iirc, one result of the discussion linked above was that sbd now checks the
watchdog-timeout against the sync-timeout when a qdevice is being used. The
default sync-timeout is 30s and your watchdog-timeout is 20s, so I would expect
a reasonably current sbd to refuse startup.
But iirc, in the linked discussion the pacemaker node did eventually become
non-quorate; there was just a possible split-brain gap while sync-timeout >
watchdog-timeout.
So if your pacemaker instance stays quorate, it has to be something else.
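Just to make gathering that information easier, something along these lines
should do it - a rough sketch assuming an RPM-based install; if sync_timeout is
not set explicitly in corosync.conf, the qdevice default of 30s applies:

  # rpm -q sbd pacemaker corosync corosync-qdevice
  # grep -A10 'quorum {' /etc/corosync/corosync.conf
  # corosync-cmapctl | grep -i timeout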


>
> > > >
> > > > Does SBD log "Majority of devices lost - surviving on pacemaker" or
> > > > some other messages related to Pacemaker?
> > >
> > > Yes.
> > >
> > > >
> > > > Also what is the status of Pacemaker when the network is down? Does
> it
> > > > report no quorum or something else?
> > > >
> > >
> > > Pacemaker on the failing node shows quorum even though it has lost
> > > communication to the Quorum Device and to the other node in the
> cluster.
> > > The non-failing node of the cluster can see the Quorum Device system
> and
> > > thus correctly determines to fence the failing node and take over its
> > > resources.
>

Hmm ... maybe some problem with the qdevice setup and/or the quorum strategy
(LMS, for instance).
If quorum doesn't work properly, your cluster won't work properly regardless of
whether sbd kills the node properly or not.


> > >
> > > Only after I run firewall-cmd --panic-off, will the failing node start
> to log
> > > messages about loss of TOTEM and getting a new consensus with the
> > > now visible members.
> > >
> >
> > Where exactly do you use firewalld panic mode? You have hosts, you
> > have VM, you have qnode ...
> >
> > Have you verified that the network is blocked bidirectionally? I had
> > rather mixed experience with asymmetrical firewalls which resembles
> > your description.
>
> In my testing harness, I will send a script to the remote node which
> contains the firewall-cmd --panic-on, a sleep command, and then
> turn off the panic mode.  That way I can adjust the length of time
> network is unavailable on a single node.  I used to log into a network
> switch to turn ports off, but that is not possible in a Cloud environment.
> I have also played with manually creating iptables rules, but the panic
> mode
> is simply easier and accomplishes the task.
>
> I have verified that when panic mode is on, no inbound or outbound
> network traffic is allowed.   This includes iSCSI packets as well.  You
> better
> have access to the console or the ability to reset the system.
>
>
> >
> > Also it may depend on the corosync driver in use.
> 

Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-09 Thread Robert Hayden

> -Original Message-
> From: Users  On Behalf Of Andrei
> Borzenkov
> Sent: Wednesday, November 9, 2022 2:59 AM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> 
> Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> 
> On Mon, Nov 7, 2022 at 5:07 PM Robert Hayden
>  wrote:
> >
> >
> > > -Original Message-
> > > From: Users  On Behalf Of Valentin
> Vidic
> > > via Users
> > > Sent: Sunday, November 6, 2022 5:20 PM
> > > To: users@clusterlabs.org
> > > Cc: Valentin Vidić 
> > > Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> > >
> > > On Sun, Nov 06, 2022 at 09:08:19PM +, Robert Hayden wrote:
> > > > When SBD_PACEMAKER was set to "yes", the lack of network
> connectivity
> > > to the node
> > > > would be seen and acted upon by the remote nodes (evicts and takes
> > > > over ownership of the resources).  But the impacted node would just
> > > > sit logging IO errors.  Pacemaker would keep updating the
> /dev/watchdog
> > > > device so SBD would not self evict.   Once I re-enabled the network,
> then
> > > the
> > >
> > > Interesting, not sure if this is the expected behaviour based on:
> > >
> > >
> > > https://lists.clusterlabs.org/pipermail/users/2017-August/022699.html
> > >
> > > Does SBD log "Majority of devices lost - surviving on pacemaker" or
> > > some other messages related to Pacemaker?
> >
> > Yes.
> >
> > >
> > > Also what is the status of Pacemaker when the network is down? Does it
> > > report no quorum or something else?
> > >
> >
> > Pacemaker on the failing node shows quorum even though it has lost
> > communication to the Quorum Device and to the other node in the cluster.
> > The non-failing node of the cluster can see the Quorum Device system and
> > thus correctly determines to fence the failing node and take over its
> > resources.
> >
> > Only after I run firewall-cmd --panic-off, will the failing node start to 
> > log
> > messages about loss of TOTEM and getting a new consensus with the
> > now visible members.
> >
> 
> Where exactly do you use firewalld panic mode? You have hosts, you
> have VM, you have qnode ...
> 
> Have you verified that the network is blocked bidirectionally? I had
> rather mixed experience with asymmetrical firewalls which resembles
> your description.

In my testing harness, I will send a script to the remote node which
contains the firewall-cmd --panic-on, a sleep command, and then
turn off the panic mode.  That way I can adjust the length of time the
network is unavailable on a single node.  I used to log into a network
switch to turn ports off, but that is not possible in a Cloud environment.
I have also played with manually creating iptables rules, but the panic mode
is simply easier and accomplishes the task.
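In case it is useful to anyone, a stripped-down sketch of that script looks
roughly like this (the 120 second default and the logger calls are just
placeholders for illustration):

  #!/bin/bash
  # Block all traffic on this node for a fixed window, then restore it.
  # If launched over ssh, run it detached (nohup/at), since panic mode
  # will drop the ssh session along with everything else.
  OUTAGE_SECONDS=${1:-120}

  logger "outage test: enabling firewalld panic mode for ${OUTAGE_SECONDS}s"
  firewall-cmd --panic-on

  # While panic mode is on, nothing gets through: corosync, qdevice,
  # and the iSCSI traffic backing the SBD device are all cut.
  sleep "${OUTAGE_SECONDS}"

  firewall-cmd --panic-off
  logger "outage test: panic mode disabled, network restored"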

I have verified that when panic mode is on, no inbound or outbound
network traffic is allowed.   This includes iSCSI packets as well.  You better
have access to the console or the ability to reset the system.


> 
> Also it may depend on the corosync driver in use.
> 
> > I think all of that explains the lack of self-fencing when the sbd setting 
> > of
> > SBD_PACEMAKER=yes is used.
> >
> 
> Correct. This means that at least under some conditions
> pacemaker/corosync fail to detect isolation.


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-09 Thread Andrei Borzenkov
On Mon, Nov 7, 2022 at 5:07 PM Robert Hayden  wrote:
>
>
> > -Original Message-
> > From: Users  On Behalf Of Valentin Vidic
> > via Users
> > Sent: Sunday, November 6, 2022 5:20 PM
> > To: users@clusterlabs.org
> > Cc: Valentin Vidić 
> > Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> >
> > On Sun, Nov 06, 2022 at 09:08:19PM +, Robert Hayden wrote:
> > > When SBD_PACEMAKER was set to "yes", the lack of network connectivity
> > to the node
> > > would be seen and acted upon by the remote nodes (evicts and takes
> > > over ownership of the resources).  But the impacted node would just
> > > sit logging IO errors.  Pacemaker would keep updating the /dev/watchdog
> > > device so SBD would not self evict.   Once I re-enabled the network, then
> > the
> >
> > Interesting, not sure if this is the expected behaviour based on:
> >
> > https://lists.clusterlabs.org/pipermail/users/2017-August/022699.html
> >
> > Does SBD log "Majority of devices lost - surviving on pacemaker" or
> > some other messages related to Pacemaker?
>
> Yes.
>
> >
> > Also what is the status of Pacemaker when the network is down? Does it
> > report no quorum or something else?
> >
>
> Pacemaker on the failing node shows quorum even though it has lost
> communication to the Quorum Device and to the other node in the cluster.
> The non-failing node of the cluster can see the Quorum Device system and
> thus correctly determines to fence the failing node and take over its
> resources.
>
> Only after I run firewall-cmd --panic-off, will the failing node start to log
> messages about loss of TOTEM and getting a new consensus with the
> now visible members.
>

Where exactly do you use firewalld panic mode? You have hosts, you
have VM, you have qnode ...

Have you verified that the network is blocked bidirectionally? I have had
rather mixed experience with asymmetrical firewalls, which resembles your
description.

Also it may depend on the corosync driver in use.

> I think all of that explains the lack of self-fencing when the sbd setting of
> SBD_PACEMAKER=yes is used.
>

Correct. This means that at least under some conditions
pacemaker/corosync fail to detect isolation.


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-07 Thread Jehan-Guillaume de Rorthais via Users
On Mon, 7 Nov 2022 14:06:51 +
Robert Hayden  wrote:

> > -Original Message-
> > From: Users  On Behalf Of Valentin Vidic
> > via Users
> > Sent: Sunday, November 6, 2022 5:20 PM
> > To: users@clusterlabs.org
> > Cc: Valentin Vidić 
> > Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> > 
> > On Sun, Nov 06, 2022 at 09:08:19PM +, Robert Hayden wrote:  
> > > When SBD_PACEMAKER was set to "yes", the lack of network connectivity  
> > to the node  
> > > would be seen and acted upon by the remote nodes (evicts and takes
> > > over ownership of the resources).  But the impacted node would just
> > > sit logging IO errors.  Pacemaker would keep updating the /dev/watchdog
> > > device so SBD would not self evict.   Once I re-enabled the network, then
> > >  
> > the
> > 
> > Interesting, not sure if this is the expected behaviour based on:
> > 
> > https://lists.clusterlabs.org/pipermail/users/2017-August/022699.html
> > 
> > Does SBD log "Majority of devices lost - surviving on pacemaker" or
> > some other messages related to Pacemaker?  
> 
> Yes.
> 
> > 
> > Also what is the status of Pacemaker when the network is down? Does it
> > report no quorum or something else?
> >   
> 
> Pacemaker on the failing node shows quorum even though it has lost 
> communication to the Quorum Device and to the other node in the cluster.

This is the main issue. Maybe inspecting the corosync-cmapctl output could shed
some light on a setting we are missing?
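For example, something like this on the node that still claims quorum during
the outage (just a sketch of what I would look at first):

  # corosync-cmapctl | grep -E 'quorum|votequorum'
  # corosync-quorumtool -s
  # corosync-qdevice-tool -s

and compare the flags and vote counts with the healthy node.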

> The non-failing node of the cluster can see the Quorum Device system and 
> thus correctly determines to fence the failing node and take over its 
> resources.

Normal.

> Only after I run firewall-cmd --panic-off, will the failing node start to log
> messages about loss of TOTEM and getting a new consensus with the 
> now visible members.
> 
> I think all of that explains the lack of self-fencing when the sbd setting of
> SBD_PACEMAKER=yes is used.

I'm not sure. If I understand correctly, SBD_PACEMAKER=yes only instructs sbd to
keep an eye on the pacemaker+corosync processes (as described upthread). It
doesn't explain why Pacemaker keeps holding quorum, but I might be missing
something...


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-07 Thread Robert Hayden


> -Original Message-
> From: Users  On Behalf Of Valentin Vidic
> via Users
> Sent: Sunday, November 6, 2022 5:20 PM
> To: users@clusterlabs.org
> Cc: Valentin Vidić 
> Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> 
> On Sun, Nov 06, 2022 at 09:08:19PM +, Robert Hayden wrote:
> > When SBD_PACEMAKER was set to "yes", the lack of network connectivity
> to the node
> > would be seen and acted upon by the remote nodes (evicts and takes
> > over ownership of the resources).  But the impacted node would just
> > sit logging IO errors.  Pacemaker would keep updating the /dev/watchdog
> > device so SBD would not self evict.   Once I re-enabled the network, then
> the
> 
> Interesting, not sure if this is the expected behaviour based on:
> 
> https://lists.clusterlabs.org/pipermail/users/2017-August/022699.html
> 
> Does SBD log "Majority of devices lost - surviving on pacemaker" or
> some other messages related to Pacemaker?

Yes.

> 
> Also what is the status of Pacemaker when the network is down? Does it
> report no quorum or something else?
> 

Pacemaker on the failing node shows quorum even though it has lost 
communication to the Quorum Device and to the other node in the cluster.
The non-failing node of the cluster can see the Quorum Device system and 
thus correctly determines to fence the failing node and take over its 
resources.

Only after I run firewall-cmd --panic-off, will the failing node start to log
messages about loss of TOTEM and getting a new consensus with the 
now visible members.

I think all of that explains the lack of self-fencing when the sbd setting of
SBD_PACEMAKER=yes is used.

> --
> Valentin


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-06 Thread Valentin Vidić via Users
On Sun, Nov 06, 2022 at 09:08:19PM +, Robert Hayden wrote:
> When SBD_PACEMAKER was set to "yes", the lack of network connectivity to the 
> node 
> would be seen and acted upon by the remote nodes (evicts and takes
> over ownership of the resources).  But the impacted node would just 
> sit logging IO errors.  Pacemaker would keep updating the /dev/watchdog 
> device so SBD would not self evict.   Once I re-enabled the network, then the

Interesting, not sure if this is the expected behaviour based on:

https://lists.clusterlabs.org/pipermail/users/2017-August/022699.html

Does SBD log "Majority of devices lost - surviving on pacemaker" or
some other messages related to Pacemaker?

Also what is the status of Pacemaker when the network is down? Does it
report no quorum or something else?

-- 
Valentin


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-06 Thread Robert Hayden

> -Original Message-
> From: Jehan-Guillaume de Rorthais 
> Sent: Saturday, November 5, 2022 4:18 PM
> To: Robert Hayden 
> Cc: users@clusterlabs.org
> Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> 
> On Sat, 5 Nov 2022 20:54:55 +
> Robert Hayden  wrote:
> 
> > > -Original Message-
> > > From: Jehan-Guillaume de Rorthais 
> > > Sent: Saturday, November 5, 2022 3:45 PM
> > > To: users@clusterlabs.org
> > > Cc: Robert Hayden 
> > > Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> > >
> > > On Sat, 5 Nov 2022 20:53:09 +0100
> > > Valentin Vidić via Users  wrote:
> > >
> > > > On Sat, Nov 05, 2022 at 06:47:59PM +, Robert Hayden wrote:
> > > > > That was my impression as well...so I may have something wrong.  My
> > > > > expectation was that SBD daemon should be writing to the
> > > /dev/watchdog
> > > > > within 20 seconds and the kernel watchdog would self fence.
> > > >
> > > > I don't see anything unusual in the config except that pacemaker mode
> is
> > > > also enabled. This means that the cluster is providing signal for sbd 
> > > > even
> > > > when the storage device is down, for example:
> > > >
> > > > 883 ?SL 0:00 sbd: inquisitor
> > > > 892 ?SL 0:00  \_ sbd: watcher: /dev/vdb1 - slot: 0 - uuid: 
> > > > ...
> > > > 893 ?SL 0:00  \_ sbd: watcher: Pacemaker
> > > > 894 ?SL 0:00  \_ sbd: watcher: Cluster
> > > >
> > > > You can strace different sbd processes to see what they are doing at
> any
> > > > point.
> > >
> > > I suspect both watchers should detect the loss of
> network/communication
> > > with
> > > the other node.
> > >
> > > BUT, when sbd is in Pacemaker mode, it doesn't reset the node if the
> > > local **Pacemaker** is still quorate (via corosync). See the full chapter:
> > > «If Pacemaker integration is activated, SBD will not self-fence if
> > > **device** majority is lost [...]»
> > > https://documentation.suse.com/sle-ha/15-SP4/html/SLE-HA-all/cha-ha-storage-protect.html
> > >
> > > Would it be possible that no node is shutting down because the cluster is
> in
> > > two-node mode? Because of this mode, both would keep the quorum
> > > expecting the
> > > fencing to kill the other one... Except there's no active fencing here, 
> > > only
> > > "self-fencing".
> > >
> >
> > I failed to mention I also have a Quorum Device also setup to add its vote 
> > to
> > the quorum. So two_node is not enabled.
> 
> oh, ok.
> 
> > I suspect Valentin was onto to something with pacemaker keeping the
> watchdog
> > device updated as it thinks the cluster is ok.  Need to research and test
> > that theory out.  I will try to carve some time out next week for that.
> 
> AFAIK, Pacemaker strictly rely on SBD to deal with the watchdog. It doesn't
> feed
> it by itself.
> 
> In Pacemaker mode, SBD is watching the two most important part of the
> cluster:
> Pacemaker and Corosync:
> 
> * the "Pacemaker watcher" of SBD connects to the CIB and check it's still
>   updated on a regular basis and the self-node is marked online.
> * the "Cluster watchers" all connect with each others using a dedicated
>   communication group in corosync ring(s).
> 
> Both watchers can report a failure to SBD that would self-stop the node.
> 
> If the network if down, I suppose the cluster watcher should complain. But I
> suspect Pacemaker somehow keeps reporting as quorate, thus, forbidding
> SBD to
> kill the whole node...

I was able to reset and re-test today.   It turns out that the watchdog device
was being updated by pacemaker due to the /etc/sysconfig/sbd entry:

SBD_PACEMAKER=yes.

When I set that to "no", then after running the "firewall-cmd --panic-on"
command, the sbd daemon detected the lack of activity on
/dev/watchdog and self fenced the node within seconds.  Exactly 
what I was expecting.

When SBD_PACEMAKER was set to "yes", the lack of network connectivity to the 
node 
would be seen and acted upon by the remote nodes (evicts and takes
over ownership of the resources).  But the impacted node would just 
sit logging IO errors.  Pacemaker would keep updating the /dev/watchdog
device so SBD would not self evict.

Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-05 Thread Jehan-Guillaume de Rorthais via Users
On Sat, 5 Nov 2022 20:54:55 +
Robert Hayden  wrote:

> > -Original Message-
> > From: Jehan-Guillaume de Rorthais 
> > Sent: Saturday, November 5, 2022 3:45 PM
> > To: users@clusterlabs.org
> > Cc: Robert Hayden 
> > Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> > 
> > On Sat, 5 Nov 2022 20:53:09 +0100
> > Valentin Vidić via Users  wrote:
> >   
> > > On Sat, Nov 05, 2022 at 06:47:59PM +, Robert Hayden wrote:  
> > > > That was my impression as well...so I may have something wrong.  My
> > > > expectation was that SBD daemon should be writing to the  
> > /dev/watchdog  
> > > > within 20 seconds and the kernel watchdog would self fence.  
> > >
> > > I don't see anything unusual in the config except that pacemaker mode is
> > > also enabled. This means that the cluster is providing signal for sbd even
> > > when the storage device is down, for example:
> > >
> > > 883 ?SL 0:00 sbd: inquisitor
> > > 892 ?SL 0:00  \_ sbd: watcher: /dev/vdb1 - slot: 0 - uuid: ...
> > > 893 ?SL 0:00  \_ sbd: watcher: Pacemaker
> > > 894 ?SL 0:00  \_ sbd: watcher: Cluster
> > >
> > > You can strace different sbd processes to see what they are doing at any
> > > point.  
> > 
> > I suspect both watchers should detect the loss of network/communication
> > with
> > the other node.
> > 
> > BUT, when sbd is in Pacemaker mode, it doesn't reset the node if the
> > local **Pacemaker** is still quorate (via corosync). See the full chapter:
> > «If Pacemaker integration is activated, SBD will not self-fence if
> > **device** majority is lost [...]»
> > https://documentation.suse.com/sle-ha/15-SP4/html/SLE-HA-all/cha-ha-storage-protect.html
> > 
> > Would it be possible that no node is shutting down because the cluster is in
> > two-node mode? Because of this mode, both would keep the quorum
> > expecting the
> > fencing to kill the other one... Except there's no active fencing here, only
> > "self-fencing".
> >   
> 
> I failed to mention I also have a Quorum Device also setup to add its vote to
> the quorum. So two_node is not enabled. 

oh, ok.

> I suspect Valentin was onto to something with pacemaker keeping the watchdog
> device updated as it thinks the cluster is ok.  Need to research and test
> that theory out.  I will try to carve some time out next week for that.

AFAIK, Pacemaker strictly relies on SBD to deal with the watchdog. It doesn't
feed it by itself.

In Pacemaker mode, SBD is watching the two most important parts of the cluster,
Pacemaker and Corosync:

* the "Pacemaker watcher" of SBD connects to the CIB and checks that it is still
  updated on a regular basis and that the self-node is marked online.
* the "Cluster watchers" all connect with each other using a dedicated
  communication group in the corosync ring(s).

Both watchers can report a failure to SBD that would self-stop the node.

If the network is down, I suppose the cluster watcher should complain. But I
suspect Pacemaker somehow keeps reporting as quorate, thus forbidding SBD to
kill the whole node...
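A rough way to confirm that suspicion during the outage (to be adapted to the
actual setup):

  # corosync-quorumtool -s   # what corosync/votequorum thinks
  # crm_node -q              # prints 1 if the local Pacemaker partition has quorum
  # crm_mon -1 | head        # the cluster state as Pacemaker sees it

If those still report quorate while the node is isolated, sbd in Pacemaker mode
will keep feeding the watchdog and the node won't self-fence.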

> Appreciate all of the feedback.  I have been dealing with Cluster Suite for a
> decade+ but focused on the company's setup.  I still have lots to learn,
> which keeps me interested.

+1

Keep us informed!

Regards,


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-05 Thread Robert Hayden
> -Original Message-
> From: Jehan-Guillaume de Rorthais 
> Sent: Saturday, November 5, 2022 3:45 PM
> To: users@clusterlabs.org
> Cc: Robert Hayden 
> Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> 
> On Sat, 5 Nov 2022 20:53:09 +0100
> Valentin Vidić via Users  wrote:
> 
> > On Sat, Nov 05, 2022 at 06:47:59PM +, Robert Hayden wrote:
> > > That was my impression as well...so I may have something wrong.  My
> > > expectation was that SBD daemon should be writing to the
> /dev/watchdog
> > > within 20 seconds and the kernel watchdog would self fence.
> >
> > I don't see anything unusual in the config except that pacemaker mode is
> > also enabled. This means that the cluster is providing signal for sbd even
> > when the storage device is down, for example:
> >
> > 883 ?SL 0:00 sbd: inquisitor
> > 892 ?SL 0:00  \_ sbd: watcher: /dev/vdb1 - slot: 0 - uuid: ...
> > 893 ?SL 0:00  \_ sbd: watcher: Pacemaker
> > 894 ?SL 0:00  \_ sbd: watcher: Cluster
> >
> > You can strace different sbd processes to see what they are doing at any
> > point.
> 
> I suspect both watchers should detect the loss of network/communication
> with
> the other node.
> 
> BUT, when sbd is in Pacemaker mode, it doesn't reset the node if the
> local **Pacemaker** is still quorate (via corosync). See the full chapter:
> «If Pacemaker integration is activated, SBD will not self-fence if **device**
> majority is lost [...]»
> https://documentation.suse.com/sle-ha/15-SP4/html/SLE-HA-all/cha-ha-storage-protect.html
> 
> Would it be possible that no node is shutting down because the cluster is in
> two-node mode? Because of this mode, both would keep the quorum
> expecting the
> fencing to kill the other one... Except there's no active fencing here, only
> "self-fencing".
> 

I failed to mention that I also have a Quorum Device set up to add its vote to
the quorum.  So two_node is not enabled.  I suspect Valentin was onto something
with pacemaker keeping the watchdog device updated as it thinks the cluster is
ok.  Need to research and test that theory out.  I will try to carve some time
out next week for that.

Appreciate all of the feedback.  I have been dealing with Cluster Suite for a 
decade+
but focused on the company's setup.  I still have lots to learn, which keeps me
interested.

> To verify this guess, check the corosync conf for the "two_node" parameter
> and
> if both nodes still report as quorate during network outage using:
> 
>   corosync-quorumtool -s
> 
> If this turn to be a good guess, without **active** fencing, I suppose a
> cluster
> can not rely on the two-node mode. I'm not sure what would be the best
> setup
> though.
> 
> Regards,


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-05 Thread Jehan-Guillaume de Rorthais via Users
On Sat, 5 Nov 2022 20:53:09 +0100
Valentin Vidić via Users  wrote:

> On Sat, Nov 05, 2022 at 06:47:59PM +, Robert Hayden wrote:
> > That was my impression as well...so I may have something wrong.  My
> > expectation was that SBD daemon should be writing to the /dev/watchdog
> > within 20 seconds and the kernel watchdog would self fence.  
> 
> I don't see anything unusual in the config except that pacemaker mode is
> also enabled. This means that the cluster is providing signal for sbd even
> when the storage device is down, for example:
> 
> 883 ?SL 0:00 sbd: inquisitor
> 892 ?SL 0:00  \_ sbd: watcher: /dev/vdb1 - slot: 0 - uuid: ...
> 893 ?SL 0:00  \_ sbd: watcher: Pacemaker
> 894 ?SL 0:00  \_ sbd: watcher: Cluster
> 
> You can strace different sbd processes to see what they are doing at any
> point.

I suspect both watchers should detect the loss of network/communication with
the other node.

BUT, when sbd is in Pacemaker mode, it doesn't reset the node if the
local **Pacemaker** is still quorate (via corosync). See the full chapter:
«If Pacemaker integration is activated, SBD will not self-fence if **device**
majority is lost [...]»
https://documentation.suse.com/sle-ha/15-SP4/html/SLE-HA-all/cha-ha-storage-protect.html

Would it be possible that no node is shutting down because the cluster is in
two-node mode? Because of this mode, both would keep the quorum expecting the
fencing to kill the other one... Except there's no active fencing here, only
"self-fencing".

To verify this guess, check the corosync conf for the "two_node" parameter and
whether both nodes still report as quorate during the network outage using:

  corosync-quorumtool -s
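(For reference, the two-node variant of the quorum section usually looks roughly
like this - option names from memory, so double-check against the votequorum man
page:

  quorum {
      provider: corosync_votequorum
      two_node: 1
  }

whereas a qdevice setup would have a device { } block inside quorum { } and
normally no two_node.)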

If this turns out to be a good guess: without **active** fencing, I suppose a
cluster cannot rely on the two-node mode. I'm not sure what the best setup
would be, though.

Regards,


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-05 Thread Valentin Vidić via Users
On Sat, Nov 05, 2022 at 06:47:59PM +, Robert Hayden wrote:
> That was my impression as well...so I may have something wrong.  My 
> expectation was that SBD daemon
> should be writing to the /dev/watchdog within 20 seconds and the kernel 
> watchdog would self fence.

I don't see anything unusual in the config except that pacemaker mode is
also enabled. This means that the cluster is providing signal for sbd even
when the storage device is down, for example:

883 ?SL 0:00 sbd: inquisitor
892 ?SL 0:00  \_ sbd: watcher: /dev/vdb1 - slot: 0 - uuid: 
18b958fa-fdae-455a-aa9d-a204a6eed04b
893 ?SL 0:00  \_ sbd: watcher: Pacemaker
894 ?SL 0:00  \_ sbd: watcher: Cluster

You can strace different sbd processes to see what they are doing at any point.
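For example (PIDs taken from the listing above, adjust to your system):

  # strace -ttT -p 892   # disk watcher: should show periodic reads of the SBD slot
  # strace -ttT -p 893   # Pacemaker watcher

If the disk watcher is stuck in a read on the iSCSI-backed device while the
Pacemaker watcher keeps running normally, that would match the behaviour
described here.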

Easy way to test if watchdog is working is to pause all sbd processes, for 
example:

# pkill -STOP sbd

For me this causes a node reset after 5 seconds as defined by: 
SBD_WATCHDOG_TIMEOUT=5

-- 
Valentin


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-05 Thread Robert Hayden
> -Original Message-
> From: Users  On Behalf Of Valentin Vidic
> via Users
> Sent: Saturday, November 5, 2022 1:07 PM
> To: users@clusterlabs.org
> Cc: Valentin Vidić 
> Subject: Re: [ClusterLabs] [External] : Re: Fence Agent tests
> 
> On Sat, Nov 05, 2022 at 05:20:47PM +, Robert Hayden wrote:
> > The OCI compute instances don't have a hardware watchdog, only the
> software watchdog.
> > So, when the network goes completely hung (e.g. firewall-cmd panic-on),
> all network
> > traffic stops which implies that IO to the SBD device also stops.  I do not 
> > see
> the software
> > watchdog take any action in response to the network hang.
> 
> It seems like the watchdog is not working or is not configured with a
> correct timeout here. sbd will not refresh the watchdog if it fails to
> read from the disk, so the watchdog should eventually expire and reset
> the node.

That was my impression as well...so I may have something wrong.  My expectation
was that the SBD daemon has to write to /dev/watchdog at least every 20 seconds,
or else the kernel watchdog would self fence.

Here is my setup
root:dh2vgmprepap02:ablgmprep:/root:# grep ^SBD /etc/sysconfig/sbd
SBD_DEVICE=/dev/disk/by-id/scsi-360e59ebc0f414569bcc7a5e4a6d58ccb-part1
SBD_PACEMAKER=yes
SBD_STARTMODE=always
SBD_DELAY_START=no
SBD_WATCHDOG_DEV=/dev/watchdog
SBD_WATCHDOG_TIMEOUT=5
SBD_TIMEOUT_ACTION=flush,reboot
SBD_MOVE_TO_ROOT_CGROUP=auto
SBD_OPTS=

root:dh2vgmprepap02:ablgmprep:/root:# sbd -d 
/dev/disk/by-id/scsi-360e59ebc0f414569bcc7a5e4a6d58ccb-part1 dump
==Dumping header on disk 
/dev/disk/by-id/scsi-360e59ebc0f414569bcc7a5e4a6d58ccb-part1
Header version : 2.1
UUID   : 04096cc5-1fb8-44da-9c4f-4b6034a0fe06
Number of slots: 255
Sector size: 512
Timeout (watchdog) : 20
Timeout (allocate) : 2
Timeout (loop) : 1
Timeout (msgwait)  : 40
==Header on disk /dev/disk/by-id/scsi-360e59ebc0f414569bcc7a5e4a6d58ccb-part1 
is dumped

root:dh2vgmprepap02:ablgmprep:/root:# pcs stonith sbd status  --full
SBD STATUS
<node name>: <installed> | <enabled> | <running>
dh2vgmprepap03: YES | YES | YES
dh2vgmprepap02: YES | YES | YES

Messages list on device 
'/dev/disk/by-id/scsi-360e59ebc0f414569bcc7a5e4a6d58ccb-part1':
0   dh2vgmprepap03  clear
1   dh2vgmprepap02  clear


SBD header on device 
'/dev/disk/by-id/scsi-360e59ebc0f414569bcc7a5e4a6d58ccb-part1':
==Dumping header on disk 
/dev/disk/by-id/scsi-360e59ebc0f414569bcc7a5e4a6d58ccb-part1
Header version : 2.1
UUID   : 04096cc5-1fb8-44da-9c4f-4b6034a0fe06
Number of slots: 255
Sector size: 512
Timeout (watchdog) : 20
Timeout (allocate) : 2
Timeout (loop) : 1
Timeout (msgwait)  : 40
==Header on disk /dev/disk/by-id/scsi-360e59ebc0f414569bcc7a5e4a6d58ccb-part1 
is dumped


> 
> --
> Valentin


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-05 Thread Valentin Vidić via Users
On Sat, Nov 05, 2022 at 05:20:47PM +, Robert Hayden wrote:
> The OCI compute instances don't have a hardware watchdog, only the software 
> watchdog.
> So, when the network goes completely hung (e.g. firewall-cmd panic-on), all 
> network 
> traffic stops which implies that IO to the SBD device also stops.  I do not 
> see the software
> watchdog take any action in response to the network hang.

It seems like the watchdog is not working or is not configured with a
correct timeout here. sbd will not refresh the watchdog if it fails to
read from the disk, so the watchdog should eventually expire and reset
the node.

-- 
Valentin


Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-05 Thread Robert Hayden
> -Original Message-
> From: Users  On Behalf Of Andrei
> Borzenkov
> Sent: Saturday, November 5, 2022 1:17 AM
> To: users@clusterlabs.org
> Subject: [External] : Re: [ClusterLabs] Fence Agent tests
> 
> On 04.11.2022 23:46, Robert Hayden wrote:
> > I am working on a Fencing agent for the Oracle Cloud Infrastructure (OCI)
> environment to complete power fencing of compute instances.  The only
> fencing setups I have seen for OCI are using SBD, but that is insufficient 
> with
> full network interruptions since OCI uses iSCSI to write/read to the SBD disk.
> >
> 
> Out of curiosity - why is it insufficient? If cluster node is completely
> isolated, it should commit suicide. If host where cluster node is
> running is completely isolated, then you cannot do anything with this
> host anyway.

Personally, this was my first attempt with SBD, so I may be missing some core
protections.  I am more familiar with IPMILAN power fencing.  In my testing with
a full network hang (firewall-cmd panic-on), I was not getting the expected
fencing results with SBD like I would with power fencing.  Hence my long overdue
learning of python to take a crack at writing a fencing agent.

In my configuration, I am using HA-LVM (vg tags) to protect XFS file systems.  
When
resources fail over, the file system moves to another node.   

The OCI compute instances don't have a hardware watchdog, only the software 
watchdog.
So, when the network goes completely hung (e.g. firewall-cmd panic-on), all 
network 
traffic stops which implies that IO to the SBD device also stops.  I do not see 
the software
watchdog take any action in response to the network hang.   The remote node will
see the network issue and write out the reset message in the SBD device slot 
for the hung
node to suicide.  But the impacted node cannot read that SBD device, so it 
never 
gets the message.  It just sits.  Applications can still run, but they don't 
have access to
the disks either (which is good).  In the full network hang, the remote node
will wait until 2x the SBD msg timeout and then assume fencing was successful.
It will then attempt to move the XFS file systems over.  If the network-hung
node wakes up, I now have the XFS file systems mounted on both nodes, leading
to corruption.

This may be eliminated if I moved the HA-LVM setup from vg_tags to system_id.
With vg_tags, pacemaker adds a "pacemaker" tag to all controlled volume groups
regardless of the node that has the vg activated.  With system_id, the node's
uname is added to the vg metadata, so each node knows who officially has the vg
activated.  I have not played with that scenario in OCI just yet.  I am not sure
if pacemaker would simply remove the other node's uname and add its own when it
attempts to move the resource.  It is on my list to test because we moved to the
uname setup with Linux 8.
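If I go down that route, my understanding is the change would look roughly like
the sketch below (untested on my side, and the resource/VG names are made up):

  # /etc/lvm/lvm.conf on every cluster node
  global {
      system_id_source = "uname"
  }

  # replace the tag-based LVM resource with the system_id access mode
  pcs resource create havg ocf:heartbeat:LVM-activate \
      vgname=havg_vg vg_access_mode=system_id --group ha_group

with the LVM-activate agent setting and clearing the VG system ID as the
resource moves between nodes.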

Again, this was my first attempt with SBD, so I may have it set up completely
wrong.

> 
> I am not familiar with OCI architecture so I may be missing something
> obvious here.
> 
> 