Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Noticed oddity when DC is going to be fenced

2022-03-04 Thread Ken Gaillot
On Fri, 2022-03-04 at 08:17 +0100, Ulrich Windl wrote:
> > > > Ken Gaillot  wrote on 02.03.2022 at
> > > > 16:10 in
> message
> :
> > On Wed, 2022-03-02 at 08:41 +0100, Ulrich Windl wrote:
> > > > > > Ken Gaillot  wrote on 01.03.2022 at
> > > > > > 16:04 in
> > > message
> > > <463458e414f7c411eb1107335be6ee9a6e2d13ee.ca...@redhat.com>:
> > > > On Tue, 2022‑03‑01 at 10:05 +0100, Ulrich Windl wrote:
> > > > > Hi!
> > > > > 
> > > > > For current SLES15 SP3 I noticed an oddity when the node
> > > > > running
> > > > > the
> > > > > DC is going to be fenced:
> > > > > It seems that another node is performing recovery operations
> > > > > while
> > > > > the old DC is not confirmed to be fenced.
> > > > > 
> > > > > Like this (116 is the DC):
> > > > > Mar 01 01:33:53 h18 corosync[6754]:   [TOTEM ] A new
> > > > > membership
> > > > > (172.20.16.18:45612) was formed. Members left: 116
> > > > > 
> > > > > Mar 01 01:33:53 h18 corosync[6754]:   [MAIN  ] Completed
> > > > > service
> > > > > synchronization, ready to provide service.
> > > > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice: Our
> > > > > peer
> > > > > on
> > > > > the DC (h16) is dead
> > > > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice: State
> > > > > transition S_NOT_DC ‑> S_ELECTION
> > > > 
> > > > At this point, h16 loses its DC status, so there is no DC
> > > 
> > > I agree from the perspective of h18, but does the DC (h16) know,
> > > too?
> > > 
> > > > > Mar 01 01:33:53 h18 dlm_controld[8544]: 394518 fence request
> > > > > 116
> > > > > pid
> > > > > 16307 nodedown time 1646094833 fence_all dlm_stonith
> > > > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice: State
> > > > > transition S_ELECTION ‑> S_INTEGRATION
> > > > 
> > > > At this point, a new DC election has completed, and h18 is now
> > > > the
> > > > DC
> > > > (as indicated by the scheduler messages later)
> > > 
> > > As above: Does h16 know at this point?
> > 
> > It doesn't matter, because the first thing the new DC will do is
> > fence
> > it. That's the beauty of fencing :)
> 
> I disagree:
> When there are possibly multiple DCs active, fencing must ensure
> FIRST that
> there is no other DC active BEFORE starting any action.
> Just imagine two non-cluster filesystems on shared storage being
> mounted on two
> different nodes concurrently.
> 
> Fencing the "other" DC at some later time does not help.
> 
> Regards,
> Ulrich

That's exactly what happens -- fencing is always scheduled before
resource recovery. There are some minor exceptions for things that are
safe, like stopping resources and starting fence devices.
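
For anyone who wants to see that ordering on their own cluster, one way is to
let crm_simulate replay the scheduler's input and dump the transition graph;
the fencing of the unclean node should show up as a prerequisite of the
recovery actions, while plain stops and fence-device starts need not wait for
it. A minimal sketch (the output file names are arbitrary, and the exact
option spelling may differ slightly between Pacemaker versions):

  # Use the live CIB as input, simulate the transition, and save the graph
  crm_simulate --live-check --simulate \
      --save-graph /tmp/transition.xml --save-dotfile /tmp/transition.dot

  # The dot file makes the ordering edges easy to eyeball
  grep -i stonith /tmp/transition.dot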

> 
> > > > > Mar 01 01:33:53 h18 dlm_stonith[16307]: stonith_api_time:
> > > > > Found 1
> > > > > entries for 116/(null): 0 in progress, 0 completed
> > > > > Mar 01 01:33:53 h18 pacemaker‑fenced[6973]:  notice: Client
> > > > > stonith‑
> > > > > api.16307.4961743f wants to fence (reboot) '116' with device
> > > > > '(any)'
> > > > > Mar 01 01:33:53 h18 pacemaker‑fenced[6973]:  notice:
> > > > > Requesting
> > > > > peer
> > > > > fencing (reboot) targeting h16
> > > > > 
> > > > > Mar 01 01:33:53 h18 pacemaker‑schedulerd[6978]:  warning:
> > > > > Cluster
> > > > > node h16 will be fenced: peer is no longer part of the
> > > > > cluster
> > > > > Mar 01 01:33:53 h18 pacemaker‑schedulerd[6978]:  warning:
> > > > > Node
> > > > > h16 is
> > > > > unclean
> > > > > 
> > > > > (so far, so good)
> > > > > Mar 01 01:33:53 h18 pacemaker‑schedulerd[6978]:  warning:
> > > > > Scheduling
> > > > > Node h16 for STONITH
> > > > > Mar 01 01:33:53 h18 pacemaker‑schedulerd[6978]:  notice:  *
> > > > > Fence
> > > > > (reboot) h16 'peer is no longer part of the cluster'
> > > > > 
> > > > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice:
> > > > > Initiating
> > > > > monitor operation prm_stonith_sbd_monitor_60 locally on
> > > > > h18
> > > > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice:
> > > > > Requesting
> > > > > local execution of monitor operation for prm_stonith_sbd on
> > > > > h18
> > > > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice:
> > > > > Initiating
> > > > > stop operation prm_cron_snap_v17_stop_0 on h19
> > > > > (isn't h18 already acting as DC while h16 isn't fenced yet?)
> > > > > 
> > > > > Mar 01 01:35:23 h18 pacemaker‑controld[6980]:  error: Node
> > > > > h18
> > > > > did
> > > > > not send monitor result (via controller) within 9ms
> > > > > (action
> > > > > timeout plus cluster‑delay)
> > > > > Mar 01 01:35:23 h18 pacemaker‑controld[6980]:  error:
> > > > > [Action   26]:
> > > > > In‑flight resource op prm_stonith_sbd_monitor_60 on h18
> > > > > (priority: 9900, waiting: (null))
> > > > > Mar 01 01:35:23 h18 pacemaker‑controld[6980]:  notice:
> > > > > Transition
> > > > > 0
> > > > > aborted: Action lost
> > > > > Mar 01 01:35:23 h18 pacemaker‑controld[6980]:  warning:
> > > > > rsc_op
> > > > > 26:
> > > > > prm_stonith_sbd_monitor_60 on h18 timed out
> > > > > 
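
The timeout in that last block is the operation's own timeout plus the
cluster-delay cluster property. To see which values were actually in effect,
something like the following should work (a sketch; cluster-delay may simply
be unset, in which case the documented default applies, and prm_stonith_sbd
is the resource name taken from the logs above):

  # Cluster-wide delay added on top of the action timeout
  crm_attribute --type crm_config --name cluster-delay --query

  # The monitor's own timeout, as configured on the resource
  cibadmin --query --xpath "//primitive[@id='prm_stonith_sbd']//op[@name='monitor']"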

[ClusterLabs] Antw: Re: Antw: [EXT] Re: Noticed oddity when DC is going to be fenced

2022-03-03 Thread Ulrich Windl
>>> Ken Gaillot  wrote on 02.03.2022 at 16:10 in
message
:
> On Wed, 2022-03-02 at 08:41 +0100, Ulrich Windl wrote:
>> > > > Ken Gaillot  wrote on 01.03.2022 at
>> > > > 16:04 in
>> message
>> <463458e414f7c411eb1107335be6ee9a6e2d13ee.ca...@redhat.com>:
>> > On Tue, 2022‑03‑01 at 10:05 +0100, Ulrich Windl wrote:
>> > > Hi!
>> > > 
>> > > For current SLES15 SP3 I noticed an oddity when the node running
>> > > the
>> > > DC is going to be fenced:
>> > > It seems that another node is performing recovery operations
>> > > while
>> > > the old DC is not confirmed to be fenced.
>> > > 
>> > > Like this (116 is the DC):
>> > > Mar 01 01:33:53 h18 corosync[6754]:   [TOTEM ] A new membership
>> > > (172.20.16.18:45612) was formed. Members left: 116
>> > > 
>> > > Mar 01 01:33:53 h18 corosync[6754]:   [MAIN  ] Completed service
>> > > synchronization, ready to provide service.
>> > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice: Our peer
>> > > on
>> > > the DC (h16) is dead
>> > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice: State
>> > > transition S_NOT_DC ‑> S_ELECTION
>> > 
>> > At this point, h16 loses its DC status, so there is no DC
>> 
>> I agree from the perspective of h18, but does the DC (h16) know, too?
>> 
>> > > Mar 01 01:33:53 h18 dlm_controld[8544]: 394518 fence request 116
>> > > pid
>> > > 16307 nodedown time 1646094833 fence_all dlm_stonith
>> > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice: State
>> > > transition S_ELECTION ‑> S_INTEGRATION
>> > 
>> > At this point, a new DC election has completed, and h18 is now the
>> > DC
>> > (as indicated by the scheduler messages later)
>> 
>> As above: Does h16 know at this point?
> 
> It doesn't matter, because the first thing the new DC will do is fence
> it. That's the beauty of fencing :)

I disagree:
When there are possibly multiple DCs active, fencing must ensure FIRST that
there is no other DC active BEFORE starting any action.
Just imagine two non-cluster filesystems on shared storage being mounted on two
different nodes concurrently.

Fencing the "other" DC at some later time does not help.

Regards,
Ulrich
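
For the record, one can also check afterwards when the fencer actually
confirmed the kill and compare that with the timestamps of the recovery
actions. A sketch, using the node name from the logs above:

  # Fencing history recorded for the target node (who requested it, when it completed)
  stonith_admin --history h16 --verbose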

> 
>> 
>> > > Mar 01 01:33:53 h18 dlm_stonith[16307]: stonith_api_time: Found 1
>> > > entries for 116/(null): 0 in progress, 0 completed
>> > > Mar 01 01:33:53 h18 pacemaker‑fenced[6973]:  notice: Client
>> > > stonith‑
>> > > api.16307.4961743f wants to fence (reboot) '116' with device
>> > > '(any)'
>> > > Mar 01 01:33:53 h18 pacemaker‑fenced[6973]:  notice: Requesting
>> > > peer
>> > > fencing (reboot) targeting h16
>> > > 
>> > > Mar 01 01:33:53 h18 pacemaker‑schedulerd[6978]:  warning: Cluster
>> > > node h16 will be fenced: peer is no longer part of the cluster
>> > > Mar 01 01:33:53 h18 pacemaker‑schedulerd[6978]:  warning: Node
>> > > h16 is
>> > > unclean
>> > > 
>> > > (so far, so good)
>> > > Mar 01 01:33:53 h18 pacemaker‑schedulerd[6978]:  warning:
>> > > Scheduling
>> > > Node h16 for STONITH
>> > > Mar 01 01:33:53 h18 pacemaker‑schedulerd[6978]:  notice:  * Fence
>> > > (reboot) h16 'peer is no longer part of the cluster'
>> > > 
>> > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice: Initiating
>> > > monitor operation prm_stonith_sbd_monitor_60 locally on h18
>> > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice: Requesting
>> > > local execution of monitor operation for prm_stonith_sbd on h18
>> > > Mar 01 01:33:53 h18 pacemaker‑controld[6980]:  notice: Initiating
>> > > stop operation prm_cron_snap_v17_stop_0 on h19
>> > > (isn't h18 already acting as DC while h16 isn't fenced yet?)
>> > > 
>> > > Mar 01 01:35:23 h18 pacemaker‑controld[6980]:  error: Node h18
>> > > did
>> > > not send monitor result (via controller) within 9ms (action
>> > > timeout plus cluster‑delay)
>> > > Mar 01 01:35:23 h18 pacemaker‑controld[6980]:  error:
>> > > [Action   26]:
>> > > In‑flight resource op prm_stonith_sbd_monitor_60 on h18
>> > > (priority: 9900, waiting: (null))
>> > > Mar 01 01:35:23 h18 pacemaker‑controld[6980]:  notice: Transition
>> > > 0
>> > > aborted: Action lost
>> > > Mar 01 01:35:23 h18 pacemaker‑controld[6980]:  warning: rsc_op
>> > > 26:
>> > > prm_stonith_sbd_monitor_60 on h18 timed out
>> > > (whatever that means)
>> > > 
>> > > (now the fencing confirmation follows)
>> > > Mar 01 01:35:55 h18 pacemaker‑fenced[6973]:  notice: Operation
>> > > 'reboot' [16309] (call 2 from stonith‑api.16307) for host 'h16'
>> > > with
>> > > device 'prm_stonith_sbd' returned: 0 (OK)
>> > > Mar 01 01:35:55 h18 pacemaker‑fenced[6973]:  notice: Operation
>> > > 'reboot' targeting h16 on h18 for stonith‑api.16307@h18.36b9a9bb:
>> > > OK
>> > > Mar 01 01:35:55 h18 stonith‑api[16307]: stonith_api_kick: Node
>> > > 116/(null) kicked: reboot
>> > > Mar 01 01:35:55 h18 pacemaker‑fenced[6973]:  notice: Operation
>> > > 'reboot' targeting h16 on rksaph18 for 
>> > > pacemaker‑controld.6980@h18.8ce2f33f (merged): OK
>> > > Mar 01 01:35:55 h18 pacemaker‑controld[6980]:  notice: Peer h16