Re: Deadlock between backend and recovery may not be detected

2021-01-05 Thread Fujii Masao




On 2021/01/06 11:48, Masahiko Sawada wrote:

On Tue, Jan 5, 2021 at 3:26 PM Fujii Masao  wrote:




On 2020/12/25 13:16, Kyotaro Horiguchi wrote:

At Wed, 23 Dec 2020 21:42:47 +0900, Fujii Masao  
wrote in

you. Attached
is the updated version of the patch. What about this version?


The patch contains a hunk in the following structure.

+ if (got_standby_lock_timeout)
+ goto cleanup;
+
+ if (got_standby_deadlock_timeout)
+ {
...
+ }
+
+cleanup:

It is equivalent to

+ if (!got_standby_lock_timeout && got_standby_deadlock_timeout)
+ {
...
+ }

Is there any reason for the goto?


Yes. That's because we have the following code using goto.

+   /* Quick exit if there's no work to be done */
+   if (!VirtualTransactionIdIsValid(*backends))
+   goto cleanup;
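To make the point being debated concrete, here is a minimal, compilable sketch (with hypothetical flag and helper names, not the actual PostgreSQL code) of the two control-flow shapes; taken in isolation, they behave identically for every combination of the two timeout flags, which is the reviewer's observation, while the quick-exit `goto cleanup;` above is the reason the patch keeps the goto style throughout:

```c
#include <assert.h>
#include <stdbool.h>

/* Variant using goto, mirroring the structure of the patch hunk.
 * Returns the number of deadlock-check requests "sent". */
static int
resolve_with_goto(bool lock_timeout, bool deadlock_timeout)
{
    int requests = 0;

    if (lock_timeout)
        goto cleanup;

    if (deadlock_timeout)
        requests++;             /* pretend to send a deadlock-check request */

cleanup:
    /* timers would be disabled and flags reset here */
    return requests;
}

/* Flattened variant suggested in the review. */
static int
resolve_flattened(bool lock_timeout, bool deadlock_timeout)
{
    int requests = 0;

    if (!lock_timeout && deadlock_timeout)
        requests++;

    return requests;
}
```

Once one `goto cleanup;` already exists for the quick-exit path, keeping the same style for the lock-timeout path is arguably more consistent than mixing both forms.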


Regarding the back-patch, I was thinking of back-patching this to all the
supported branches. But I found that it's not easy to do that for v9.5,
because v9.5 doesn't have some infrastructure code that this bug fix
patch depends on. So at least commit 37c54863cf, as that infrastructure,
also needs to be back-patched to v9.5. And ISTM that the related commits
f868a8143a and 8f0de712c3 need to be back-patched as well. There might
also be some other changes that need back-patching. Unfortunately they
cannot be applied to v9.5 cleanly, and additional changes are necessary.

This situation makes me inclined to skip the back-patch to v9.5, because
the next minor version release is the final one for v9.5. So if we
unexpectedly introduce a new bug into v9.5 by the back-patch, there is no
chance to fix it. OTOH, of course, if we don't do the back-patch, there is
no chance to fix the deadlock detection bug, since the final minor version
for v9.5 won't include that fix. But I'm afraid that the back-patch to
v9.5 may bring more risk than gain.

Thoughts?


+1 for not back-patching to 9.5.


Thanks, all! I pushed the patch and back-patched it down to v9.6.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Deadlock between backend and recovery may not be detected

2021-01-05 Thread Masahiko Sawada
On Tue, Jan 5, 2021 at 3:26 PM Fujii Masao  wrote:
>
>
>
> On 2020/12/25 13:16, Kyotaro Horiguchi wrote:
> > At Wed, 23 Dec 2020 21:42:47 +0900, Fujii Masao 
> >  wrote in
> >> you. Attached
> >> is the updated version of the patch. What about this version?
> >
> > The patch contains a hunk in the following structure.
> >
> > + if (got_standby_lock_timeout)
> > + goto cleanup;
> > +
> > + if (got_standby_deadlock_timeout)
> > + {
> > ...
> > + }
> > +
> > +cleanup:
> >
> > It is equivalent to
> >
> > + if (!got_standby_lock_timeout && got_standby_deadlock_timeout)
> > + {
> > ...
> > + }
> >
> > Is there any reason for the goto?
>
> Yes. That's because we have the following code using goto.
>
> +   /* Quick exit if there's no work to be done */
> +   if (!VirtualTransactionIdIsValid(*backends))
> +   goto cleanup;
>
>
> Regarding the back-patch, I was thinking of back-patching this to all the
> supported branches. But I found that it's not easy to do that for v9.5,
> because v9.5 doesn't have some infrastructure code that this bug fix
> patch depends on. So at least commit 37c54863cf, as that infrastructure,
> also needs to be back-patched to v9.5. And ISTM that the related commits
> f868a8143a and 8f0de712c3 need to be back-patched as well. There might
> also be some other changes that need back-patching. Unfortunately they
> cannot be applied to v9.5 cleanly, and additional changes are necessary.
>
> This situation makes me inclined to skip the back-patch to v9.5, because
> the next minor version release is the final one for v9.5. So if we
> unexpectedly introduce a new bug into v9.5 by the back-patch, there is no
> chance to fix it. OTOH, of course, if we don't do the back-patch, there is
> no chance to fix the deadlock detection bug, since the final minor version
> for v9.5 won't include that fix. But I'm afraid that the back-patch to
> v9.5 may bring more risk than gain.
>
> Thoughts?

+1 for not back-patching to 9.5.

Regards,

-- 
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/




Re: Deadlock between backend and recovery may not be detected

2021-01-05 Thread Kyotaro Horiguchi
At Tue, 5 Jan 2021 15:26:50 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2020/12/25 13:16, Kyotaro Horiguchi wrote:
> > At Wed, 23 Dec 2020 21:42:47 +0900, Fujii Masao
> >  wrote in
> >> you. Attached
> >> is the updated version of the patch. What about this version?
> > The patch contains a hunk in the following structure.
> > +   if (got_standby_lock_timeout)
> > +   goto cleanup;
> > +
> > +   if (got_standby_deadlock_timeout)
> > +   {
> > ...
> > +   }
> > +
> > +cleanup:
> > It is equivalent to
> > +   if (!got_standby_lock_timeout && got_standby_deadlock_timeout)
> > +   {
> > ...
> > +   }
> > Is there any reason for the goto?
> 
> Yes. That's because we have the following code using goto.
> 
> +   /* Quick exit if there's no work to be done */
> +   if (!VirtualTransactionIdIsValid(*backends))
> +   goto cleanup;

That doesn't seem to be the real *cause*, though. Such straightforward
logic with three levels of indentation is not something that needs to be
avoided by using goto, even if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK
is so lengthy that it sticks out past 80 columns.

> Regarding the back-patch, I was thinking of back-patching this to all the
> supported branches. But I found that it's not easy to do that for v9.5,
> because v9.5 doesn't have some infrastructure code that this bug fix
> patch depends on. So at least commit 37c54863cf, as that infrastructure,
> also needs to be back-patched to v9.5. And ISTM that the related commits
> f868a8143a and 8f0de712c3 need to be back-patched as well. There might
> also be some other changes that need back-patching. Unfortunately they
> cannot be applied to v9.5 cleanly, and additional changes are necessary.
>
> This situation makes me inclined to skip the back-patch to v9.5, because
> the next minor version release is the final one for v9.5. So if we
> unexpectedly introduce a new bug into v9.5 by the back-patch, there is no
> chance to fix it. OTOH, of course, if we don't do the back-patch, there is
> no chance to fix the deadlock detection bug, since the final minor version
> for v9.5 won't include that fix. But I'm afraid that the back-patch to
> v9.5 may bring more risk than gain.
>
> Thoughts?

It seems to me that the final minor release should get fixes only for
issues that we have actually received complaints about, or for
critical-ish known issues such as ones that lead to a server crash in
normal paths. This particular issue is neither.

So +1 for not back-patching to 9.5.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Deadlock between backend and recovery may not be detected

2021-01-05 Thread Victor Yegorov
Tue, 5 Jan 2021 at 07:26, Fujii Masao:

> This situation makes me inclined to skip the back-patch to v9.5, because
> the next minor version release is the final one for v9.5. So if we
> unexpectedly introduce a new bug into v9.5 by the back-patch, there is no
> chance to fix it. OTOH, of course, if we don't do the back-patch, there is
> no chance to fix the deadlock detection bug, since the final minor version
> for v9.5 won't include that fix. But I'm afraid that the back-patch to
> v9.5 may bring more risk than gain.
>
> Thoughts?
>

Honestly, I was expecting that this would not be back-patched at all,
and I am really glad we're getting this fixed in the back branches as well.

Therefore I think it's fine to skip v9.5, though I would have mentioned
this in the commit message.


-- 
Victor Yegorov


Re: Deadlock between backend and recovery may not be detected

2021-01-04 Thread Fujii Masao




On 2020/12/25 13:16, Kyotaro Horiguchi wrote:

At Wed, 23 Dec 2020 21:42:47 +0900, Fujii Masao  
wrote in

you. Attached
is the updated version of the patch. What about this version?


The patch contains a hunk in the following structure.

+   if (got_standby_lock_timeout)
+   goto cleanup;
+
+   if (got_standby_deadlock_timeout)
+   {
...
+   }
+
+cleanup:

It is equivalent to

+   if (!got_standby_lock_timeout && got_standby_deadlock_timeout)
+   {
...
+   }

Is there any reason for the goto?


Yes. That's because we have the following code using goto.

+   /* Quick exit if there's no work to be done */
+   if (!VirtualTransactionIdIsValid(*backends))
+   goto cleanup;


Regarding the back-patch, I was thinking of back-patching this to all the
supported branches. But I found that it's not easy to do that for v9.5,
because v9.5 doesn't have some infrastructure code that this bug fix
patch depends on. So at least commit 37c54863cf, as that infrastructure,
also needs to be back-patched to v9.5. And ISTM that the related commits
f868a8143a and 8f0de712c3 need to be back-patched as well. There might
also be some other changes that need back-patching. Unfortunately they
cannot be applied to v9.5 cleanly, and additional changes are necessary.

This situation makes me inclined to skip the back-patch to v9.5, because
the next minor version release is the final one for v9.5. So if we
unexpectedly introduce a new bug into v9.5 by the back-patch, there is no
chance to fix it. OTOH, of course, if we don't do the back-patch, there is
no chance to fix the deadlock detection bug, since the final minor version
for v9.5 won't include that fix. But I'm afraid that the back-patch to
v9.5 may bring more risk than gain.

Thoughts?

Regards,  


--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Deadlock between backend and recovery may not be detected

2020-12-24 Thread Kyotaro Horiguchi
At Wed, 23 Dec 2020 21:42:47 +0900, Fujii Masao  
wrote in 
> you. Attached
> is the updated version of the patch. What about this version?

The patch contains a hunk in the following structure.

+   if (got_standby_lock_timeout)
+   goto cleanup;
+
+   if (got_standby_deadlock_timeout)
+   {
...
+   }
+
+cleanup:

It is equivalent to

+   if (!got_standby_lock_timeout && got_standby_deadlock_timeout)
+   {
...
+   }

Is there any reason for the goto?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Deadlock between backend and recovery may not be detected

2020-12-24 Thread Masahiko Sawada
On Wed, Dec 23, 2020 at 9:42 PM Fujii Masao  wrote:
>
>
>
> On 2020/12/23 19:28, Masahiko Sawada wrote:
> > On Tue, Dec 22, 2020 at 11:58 PM Fujii Masao
> >  wrote:
> >>
> >>
> >>
> >> On 2020/12/22 20:42, Fujii Masao wrote:
> >>>
> >>>
> >>> On 2020/12/22 10:25, Masahiko Sawada wrote:
>  On Fri, Dec 18, 2020 at 6:36 PM Fujii Masao 
>   wrote:
> >
> >
> >
> > On 2020/12/17 2:15, Fujii Masao wrote:
> >>
> >>
> >> On 2020/12/16 23:28, Drouvot, Bertrand wrote:
> >>> Hi,
> >>>
> >>> On 12/16/20 2:36 PM, Victor Yegorov wrote:
> 
> 
> 
>  Wed, 16 Dec 2020 at 13:49, Fujii Masao:
> 
>  After doing this procedure, you can see the startup process and backend
>  wait for the table lock held by each other, i.e., a deadlock. But this
>  deadlock remains even after deadlock_timeout passes.
> 
>    This seems a bug to me.
> 
> >>> +1
> >>>
> 
>    > * Deadlocks involving the Startup process and an ordinary 
>  backend process
>    > * will be detected by the deadlock detector within the 
>  ordinary backend.
> 
>  The cause of this issue seems to be that ResolveRecoveryConflictWithLock(),
>  which the startup process calls when a recovery conflict on a lock happens,
>  doesn't take care of the deadlock case at all. You can see this by reading
>  the above source code comment for ResolveRecoveryConflictWithLock().
> 
>    To fix this issue, I think that we should enable 
>  STANDBY_DEADLOCK_TIMEOUT
>    timer in ResolveRecoveryConflictWithLock() so that the startup 
>  process can
>    send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the 
>  backend.
>    Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal 
>  arrives,
>    the backend should check whether the deadlock actually happens 
>  or not.
>  Attached is the POC patch implementing this.
> 
> >>> good catch!
> >>>
> >>> I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT 
> >>> shouldn't be set in ResolveRecoveryConflictWithLock() too (it is 
> >>> already set in ResolveRecoveryConflictWithBufferPin()).
> >>>
> >>> So +1 for considering this a bug and for the way the patch proposes
> >>> to fix it.
> >>
> >> Thanks Victor and Bertrand for agreeing!
> >> Attached is the updated version of the patch.
> >
> > Attached is v3 of the patch. Could you review this version?
> >
> > While the startup process is waiting for recovery conflict on buffer 
> > pin,
> > it repeats sending the request for deadlock check to all the backends
> > every deadlock_timeout. This may increase the workload in the startup
> > process and backends, but since this is the original behavior, the patch
> > doesn't change that. Also in practice this may not be so harmful because
> > the period that the buffer is kept pinned is basically not so long.
> >
> 
>  @@ -529,6 +603,26 @@ ResolveRecoveryConflictWithBufferPin(void)
> */
>    ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
> 
>  +   if (got_standby_deadlock_timeout)
>  +   {
>  +   /*
>  +* Send out a request for hot-standby backends to check 
>  themselves for
>  +* deadlocks.
>  +*
>  +* XXX The subsequent ResolveRecoveryConflictWithBufferPin() 
>  will wait
>  +* to be signaled by UnpinBuffer() again and send a request for
>  +* deadlock check if deadlock_timeout happens. This causes the
>  +* request to continue to be sent every deadlock_timeout until 
>  the
>  +* buffer is unpinned or ltime is reached. This would increase 
>  the
>  +* workload in the startup process and backends. In practice it 
>  may
>  +* not be so harmful because the period that the buffer is kept 
>  pinned
>  +* is basically not so long. But should we fix this?
>  +*/
>  +   SendRecoveryConflictWithBufferPin(
>  +       PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
>  +   got_standby_deadlock_timeout = false;
>  +   }
>  +
> 
>  Since SendRecoveryConflictWithBufferPin() sends the signal to all
>  backends, every backend that is waiting on a lock at ProcSleep() and not
>  holding a buffer pin blocking the startup 

Re: Deadlock between backend and recovery may not be detected

2020-12-23 Thread Fujii Masao



On 2020/12/23 19:28, Masahiko Sawada wrote:

On Tue, Dec 22, 2020 at 11:58 PM Fujii Masao
 wrote:




On 2020/12/22 20:42, Fujii Masao wrote:



On 2020/12/22 10:25, Masahiko Sawada wrote:

On Fri, Dec 18, 2020 at 6:36 PM Fujii Masao  wrote:




On 2020/12/17 2:15, Fujii Masao wrote:



On 2020/12/16 23:28, Drouvot, Bertrand wrote:

Hi,

On 12/16/20 2:36 PM, Victor Yegorov wrote:




Wed, 16 Dec 2020 at 13:49, Fujii Masao:

  After doing this procedure, you can see the startup process and backend
  wait for the table lock held by each other, i.e., a deadlock. But this
  deadlock remains even after deadlock_timeout passes.

  This seems a bug to me.


+1



  > * Deadlocks involving the Startup process and an ordinary backend 
process
  > * will be detected by the deadlock detector within the ordinary backend.

  The cause of this issue seems to be that ResolveRecoveryConflictWithLock(),
  which the startup process calls when a recovery conflict on a lock happens,
  doesn't take care of the deadlock case at all. You can see this by reading
  the above source code comment for ResolveRecoveryConflictWithLock().

  To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
  timer in ResolveRecoveryConflictWithLock() so that the startup process can
  send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
  Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
  the backend should check whether the deadlock actually happens or not.
  Attached is the POC patch implementing this.
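As a rough model of the behavior this proposal aims for (names here are illustrative stand-ins, not the actual PostgreSQL code, which arms a STANDBY_DEADLOCK_TIMEOUT timer and signals backends via SendRecoveryConflictWithBufferPin()), the key property is that a deadlock-check request goes out each time deadlock_timeout fires while the startup process is still waiting:

```c
#include <assert.h>
#include <stdbool.h>

/* Counts deadlock-check requests sent to backends; stands in for sending
 * PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK to the waiting backends. */
static int deadlock_check_requests = 0;

static bool got_standby_deadlock_timeout = false;

static void
send_deadlock_check_request(void)
{
    deadlock_check_requests++;
}

/* Simulate the startup process waiting on a conflicting lock: 'timeouts'
 * is how many deadlock_timeout periods elapse before the lock is released.
 * Each firing of the (simulated) timer results in one deadlock-check
 * request, so the request keeps going out until the wait ends. */
static int
simulate_wait_for_lock(int timeouts)
{
    deadlock_check_requests = 0;
    for (int i = 0; i < timeouts; i++)
    {
        got_standby_deadlock_timeout = true;    /* timer callback fires */
        if (got_standby_deadlock_timeout)
        {
            send_deadlock_check_request();
            got_standby_deadlock_timeout = false;
        }
    }
    return deadlock_check_requests;
}
```

The point of the fix is exactly this loop's existence: before it, the startup process waiting on a lock never asked the backend to run the deadlock detector at all.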


good catch!

I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT shouldn't be 
set in ResolveRecoveryConflictWithLock() too (it is already set in 
ResolveRecoveryConflictWithBufferPin()).

So +1 for considering this a bug and for the way the patch proposes to fix it.


Thanks Victor and Bertrand for agreeing!
Attached is the updated version of the patch.


Attached is v3 of the patch. Could you review this version?

While the startup process is waiting for recovery conflict on buffer pin,
it repeats sending the request for deadlock check to all the backends
every deadlock_timeout. This may increase the workload in the startup
process and backends, but since this is the original behavior, the patch
doesn't change that. Also in practice this may not be so harmful because
the period that the buffer is kept pinned is basically not so long.



@@ -529,6 +603,26 @@ ResolveRecoveryConflictWithBufferPin(void)
   */
  ProcWaitForSignal(PG_WAIT_BUFFER_PIN);

+   if (got_standby_deadlock_timeout)
+   {
+   /*
+* Send out a request for hot-standby backends to check themselves for
+* deadlocks.
+*
+* XXX The subsequent ResolveRecoveryConflictWithBufferPin() will wait
+* to be signaled by UnpinBuffer() again and send a request for
+* deadlock check if deadlock_timeout happens. This causes the
+* request to continue to be sent every deadlock_timeout until the
+* buffer is unpinned or ltime is reached. This would increase the
+* workload in the startup process and backends. In practice it may
+* not be so harmful because the period that the buffer is kept pinned
+* is basically not so long. But should we fix this?
+*/
+   SendRecoveryConflictWithBufferPin(
+       PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
+   got_standby_deadlock_timeout = false;
+   }
+

Since SendRecoveryConflictWithBufferPin() sends the signal to all
backends, every backend that is waiting on a lock at ProcSleep() and not
holding a buffer pin blocking the startup process will end up doing a
deadlock check, which seems expensive. What is worse is that the
deadlock will not be detected because the deadlock involving a buffer
pin isn't detected by CheckDeadLock(). I thought we can replace
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK with
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN but it’s not good because the
backend who has a buffer pin blocking the startup process and not
waiting on a lock is also canceled after deadlock_timeout. We can have
the backend return from RecoveryConflictInterrupt() when it received
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN and is not waiting on any lock,
but it’s also not good because we cannot cancel the backend after
max_standby_streaming_delay that has a buffer pin blocking the startup
process. So I wonder if we can have a new signal. When the backend
received this signal it returns from RecoveryConflictInterrupt()
without deadlock check either if it’s not waiting on any lock or if it
doesn’t have a buffer pin blocking the startup process. Otherwise it's
cancelled.
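The rule described in the paragraph above amounts to a small decision function over two conditions (a purely illustrative model with hypothetical names, not code from the patch): a backend is only treated as a potential deadlock participant when it is both waiting on a lock and holding a buffer pin that blocks the startup process.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum
{
    CONFLICT_IGNORE,   /* return from RecoveryConflictInterrupt() */
    CONFLICT_CANCEL    /* treat as deadlock participant; cancel */
} ConflictAction;

/* Model of the proposed handling for the new signal: a backend that is
 * not waiting on any lock, or that holds no buffer pin blocking the
 * startup process, simply ignores the request; only a backend doing
 * both is cancelled. */
static ConflictAction
on_recovery_conflict_signal(bool waiting_on_lock, bool pin_blocks_startup)
{
    if (!waiting_on_lock || !pin_blocks_startup)
        return CONFLICT_IGNORE;
    return CONFLICT_CANCEL;
}
```

This truth table is what distinguishes the proposed new signal from reusing PROCSIG_RECOVERY_CONFLICT_BUFFERPIN, which would also cancel backends that merely hold the pin without waiting on any lock.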


Thanks for pointing out that issue! Using a new signal is one idea. Another idea
is 

Re: Deadlock between backend and recovery may not be detected

2020-12-23 Thread Masahiko Sawada
On Tue, Dec 22, 2020 at 11:58 PM Fujii Masao
 wrote:
>
>
>
> On 2020/12/22 20:42, Fujii Masao wrote:
> >
> >
> > On 2020/12/22 10:25, Masahiko Sawada wrote:
> >> On Fri, Dec 18, 2020 at 6:36 PM Fujii Masao  
> >> wrote:
> >>>
> >>>
> >>>
> >>> On 2020/12/17 2:15, Fujii Masao wrote:
> 
> 
>  On 2020/12/16 23:28, Drouvot, Bertrand wrote:
> > Hi,
> >
> > On 12/16/20 2:36 PM, Victor Yegorov wrote:
> >>
> >>
> >>
> >> Wed, 16 Dec 2020 at 13:49, Fujii Masao:
> >>
> >>  After doing this procedure, you can see the startup process and backend
> >>  wait for the table lock held by each other, i.e., a deadlock. But this
> >>  deadlock remains even after deadlock_timeout passes.
> >>
> >>  This seems a bug to me.
> >>
> > +1
> >
> >>
> >>  > * Deadlocks involving the Startup process and an ordinary 
> >> backend process
> >>  > * will be detected by the deadlock detector within the ordinary 
> >> backend.
> >>
> >>  The cause of this issue seems to be that ResolveRecoveryConflictWithLock(),
> >>  which the startup process calls when a recovery conflict on a lock happens,
> >>  doesn't take care of the deadlock case at all. You can see this by reading
> >>  the above source code comment for ResolveRecoveryConflictWithLock().
> >>
> >>  To fix this issue, I think that we should enable 
> >> STANDBY_DEADLOCK_TIMEOUT
> >>  timer in ResolveRecoveryConflictWithLock() so that the startup 
> >> process can
> >>  send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the 
> >> backend.
> >>  Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
> >>  the backend should check whether the deadlock actually happens or 
> >> not.
> >>  Attached is the POC patch implementing this.
> >>
> > good catch!
> >
> > I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT 
> > shouldn't be set in ResolveRecoveryConflictWithLock() too (it is 
> > already set in ResolveRecoveryConflictWithBufferPin()).
> >
> > So +1 for considering this a bug and for the way the patch proposes
> > to fix it.
> 
>  Thanks Victor and Bertrand for agreeing!
>  Attached is the updated version of the patch.
> >>>
> >>> Attached is v3 of the patch. Could you review this version?
> >>>
> >>> While the startup process is waiting for recovery conflict on buffer pin,
> >>> it repeats sending the request for deadlock check to all the backends
> >>> every deadlock_timeout. This may increase the workload in the startup
> >>> process and backends, but since this is the original behavior, the patch
> >>> doesn't change that. Also in practice this may not be so harmful because
> >>> the period that the buffer is kept pinned is basically not so long.
> >>>
> >>
> >> @@ -529,6 +603,26 @@ ResolveRecoveryConflictWithBufferPin(void)
> >>   */
> >>  ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
> >>
> >> +   if (got_standby_deadlock_timeout)
> >> +   {
> >> +   /*
> >> +* Send out a request for hot-standby backends to check themselves 
> >> for
> >> +* deadlocks.
> >> +*
> >> +* XXX The subsequent ResolveRecoveryConflictWithBufferPin() will 
> >> wait
> >> +* to be signaled by UnpinBuffer() again and send a request for
> >> +* deadlock check if deadlock_timeout happens. This causes the
> >> +* request to continue to be sent every deadlock_timeout until the
> >> +* buffer is unpinned or ltime is reached. This would increase the
> >> +* workload in the startup process and backends. In practice it may
> >> +* not be so harmful because the period that the buffer is kept 
> >> pinned
> >> +* is basically not so long. But should we fix this?
> >> +*/
> >> +   SendRecoveryConflictWithBufferPin(
> >> +       PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
> >> +   got_standby_deadlock_timeout = false;
> >> +   }
> >> +
> >>
> >> Since SendRecoveryConflictWithBufferPin() sends the signal to all
> >> backends, every backend that is waiting on a lock at ProcSleep() and not
> >> holding a buffer pin blocking the startup process will end up doing a
> >> deadlock check, which seems expensive. What is worse is that the
> >> deadlock will not be detected because the deadlock involving a buffer
> >> pin isn't detected by CheckDeadLock(). I thought we can replace
> >> PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK with
> >> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN but it’s not good because the
> >> backend who has a buffer pin blocking the 

Re: Deadlock between backend and recovery may not be detected

2020-12-22 Thread Fujii Masao



On 2020/12/22 20:42, Fujii Masao wrote:



On 2020/12/22 10:25, Masahiko Sawada wrote:

On Fri, Dec 18, 2020 at 6:36 PM Fujii Masao  wrote:




On 2020/12/17 2:15, Fujii Masao wrote:



On 2020/12/16 23:28, Drouvot, Bertrand wrote:

Hi,

On 12/16/20 2:36 PM, Victor Yegorov wrote:




Wed, 16 Dec 2020 at 13:49, Fujii Masao:

 After doing this procedure, you can see the startup process and backend
 wait for the table lock held by each other, i.e., a deadlock. But this
 deadlock remains even after deadlock_timeout passes.

 This seems a bug to me.


+1



 > * Deadlocks involving the Startup process and an ordinary backend process
 > * will be detected by the deadlock detector within the ordinary backend.

 The cause of this issue seems to be that ResolveRecoveryConflictWithLock(),
 which the startup process calls when a recovery conflict on a lock happens,
 doesn't take care of the deadlock case at all. You can see this by reading
 the above source code comment for ResolveRecoveryConflictWithLock().

 To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
 timer in ResolveRecoveryConflictWithLock() so that the startup process can
 send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
 Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
 the backend should check whether the deadlock actually happens or not.
 Attached is the POC patch implementing this.


good catch!

I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT shouldn't be 
set in ResolveRecoveryConflictWithLock() too (it is already set in 
ResolveRecoveryConflictWithBufferPin()).

So +1 for considering this a bug and for the way the patch proposes to fix it.


Thanks Victor and Bertrand for agreeing!
Attached is the updated version of the patch.


Attached is v3 of the patch. Could you review this version?

While the startup process is waiting for recovery conflict on buffer pin,
it repeats sending the request for deadlock check to all the backends
every deadlock_timeout. This may increase the workload in the startup
process and backends, but since this is the original behavior, the patch
doesn't change that. Also in practice this may not be so harmful because
the period that the buffer is kept pinned is basically not so long.



@@ -529,6 +603,26 @@ ResolveRecoveryConflictWithBufferPin(void)
  */
 ProcWaitForSignal(PG_WAIT_BUFFER_PIN);

+   if (got_standby_deadlock_timeout)
+   {
+   /*
+    * Send out a request for hot-standby backends to check themselves for
+    * deadlocks.
+    *
+    * XXX The subsequent ResolveRecoveryConflictWithBufferPin() will wait
+    * to be signaled by UnpinBuffer() again and send a request for
+    * deadlock check if deadlock_timeout happens. This causes the
+    * request to continue to be sent every deadlock_timeout until the
+    * buffer is unpinned or ltime is reached. This would increase the
+    * workload in the startup process and backends. In practice it may
+    * not be so harmful because the period that the buffer is kept pinned
+    * is basically not so long. But should we fix this?
+    */
+   SendRecoveryConflictWithBufferPin(
+       PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
+   got_standby_deadlock_timeout = false;
+   }
+

Since SendRecoveryConflictWithBufferPin() sends the signal to all
backends, every backend that is waiting on a lock at ProcSleep() and not
holding a buffer pin blocking the startup process will end up doing a
deadlock check, which seems expensive. What is worse is that the
deadlock will not be detected because the deadlock involving a buffer
pin isn't detected by CheckDeadLock(). I thought we can replace
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK with
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN but it’s not good because the
backend who has a buffer pin blocking the startup process and not
waiting on a lock is also canceled after deadlock_timeout. We can have
the backend return from RecoveryConflictInterrupt() when it received
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN and is not waiting on any lock,
but it’s also not good because we cannot cancel the backend after
max_standby_streaming_delay that has a buffer pin blocking the startup
process. So I wonder if we can have a new signal. When the backend
received this signal it returns from RecoveryConflictInterrupt()
without deadlock check either if it’s not waiting on any lock or if it
doesn’t have a buffer pin blocking the startup process. Otherwise it's
cancelled.


Thanks for pointing out that issue! Using a new signal is one idea. Another idea
is to make a backend skip the deadlock check if GetStartupBufferPinWaitBufId()
returns -1, i.e., the startup process is not 

Re: Deadlock between backend and recovery may not be detected

2020-12-22 Thread Fujii Masao




On 2020/12/22 10:25, Masahiko Sawada wrote:

On Fri, Dec 18, 2020 at 6:36 PM Fujii Masao  wrote:




On 2020/12/17 2:15, Fujii Masao wrote:



On 2020/12/16 23:28, Drouvot, Bertrand wrote:

Hi,

On 12/16/20 2:36 PM, Victor Yegorov wrote:




Wed, 16 Dec 2020 at 13:49, Fujii Masao:

 After doing this procedure, you can see the startup process and backend
 wait for the table lock held by each other, i.e., a deadlock. But this
 deadlock remains even after deadlock_timeout passes.

 This seems a bug to me.


+1



 > * Deadlocks involving the Startup process and an ordinary backend process
 > * will be detected by the deadlock detector within the ordinary backend.

 The cause of this issue seems to be that ResolveRecoveryConflictWithLock(),
 which the startup process calls when a recovery conflict on a lock happens,
 doesn't take care of the deadlock case at all. You can see this by reading
 the above source code comment for ResolveRecoveryConflictWithLock().

 To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
 timer in ResolveRecoveryConflictWithLock() so that the startup process can
 send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
 Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
 the backend should check whether the deadlock actually happens or not.
 Attached is the POC patch implementing this.


good catch!

I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT shouldn't be 
set in ResolveRecoveryConflictWithLock() too (it is already set in 
ResolveRecoveryConflictWithBufferPin()).

So +1 for considering this a bug and for the way the patch proposes to fix it.


Thanks Victor and Bertrand for agreeing!
Attached is the updated version of the patch.


Attached is v3 of the patch. Could you review this version?

While the startup process is waiting for recovery conflict on buffer pin,
it repeats sending the request for deadlock check to all the backends
every deadlock_timeout. This may increase the workload in the startup
process and backends, but since this is the original behavior, the patch
doesn't change that. Also in practice this may not be so harmful because
the period that the buffer is kept pinned is basically not so long.



@@ -529,6 +603,26 @@ ResolveRecoveryConflictWithBufferPin(void)
  */
 ProcWaitForSignal(PG_WAIT_BUFFER_PIN);

+   if (got_standby_deadlock_timeout)
+   {
+   /*
+* Send out a request for hot-standby backends to check themselves for
+* deadlocks.
+*
+* XXX The subsequent ResolveRecoveryConflictWithBufferPin() will wait
+* to be signaled by UnpinBuffer() again and send a request for
+* deadlocks check if deadlock_timeout happens. This causes the
+* request to continue to be sent every deadlock_timeout until the
+* buffer is unpinned or ltime is reached. This would increase the
+* workload in the startup process and backends. In practice it may
+* not be so harmful because the period that the buffer is kept pinned
+* is basically not so long. But should we fix this?
+*/
+   SendRecoveryConflictWithBufferPin(
+                       PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
+   got_standby_deadlock_timeout = false;
+   }
+

Since SendRecoveryConflictWithBufferPin() sends the signal to all backends,
every backend that is waiting on a lock at ProcSleep() and not holding a
buffer pin blocking the startup process will end up doing a deadlock check,
which seems expensive. What is worse, the deadlock will not be detected,
because a deadlock involving a buffer pin isn't detected by CheckDeadLock().
I thought we could replace PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK with
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN, but that's not good because a backend
that has a buffer pin blocking the startup process but is not waiting on a
lock would also be canceled after deadlock_timeout. We could have the backend
return from RecoveryConflictInterrupt() when it receives
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN and is not waiting on any lock, but
that's also not good because then we could not cancel, after
max_standby_streaming_delay, a backend that has a buffer pin blocking the
startup process. So I wonder if we can have a new signal. When the backend
receives this signal, it returns from RecoveryConflictInterrupt() without a
deadlock check if it's either not waiting on any lock or not holding a buffer
pin that blocks the startup process. Otherwise it's cancelled.


Thanks for pointing out that issue! Using a new signal is one idea. Another
idea is to make a backend skip the deadlock check if
GetStartupBufferPinWaitBufId() returns -1, i.e., the startup process is not
waiting for a buffer pin. So, what I'm

Re: Deadlock between backend and recovery may not be detected

2020-12-22 Thread Fujii Masao




On 2020/12/19 1:43, Drouvot, Bertrand wrote:

Hi,

On 12/18/20 10:35 AM, Fujii Masao wrote:

CAUTION: This email originated from outside of the organization. Do not click 
links or open attachments unless you can confirm the sender and know the 
content is safe.



On 2020/12/17 2:15, Fujii Masao wrote:



On 2020/12/16 23:28, Drouvot, Bertrand wrote:

Hi,

On 12/16/20 2:36 PM, Victor Yegorov wrote:




Wed, 16 Dec 2020 at 13:49, Fujii Masao <masao.fu...@oss.nttdata.com>:

    After doing this procedure, you can see the startup process and backend
    wait for the table lock each other, i.e., deadlock. But this deadlock 
remains
    even after deadlock_timeout passes.

    This seems a bug to me.


+1



    > * Deadlocks involving the Startup process and an ordinary backend process
    > * will be detected by the deadlock detector within the ordinary backend.

    The cause of this issue seems that ResolveRecoveryConflictWithLock() that
    the startup process calls when recovery conflict on lock happens doesn't
    take care of deadlock case at all. You can see this fact by reading the 
above
    source code comment for ResolveRecoveryConflictWithLock().

    To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
    timer in ResolveRecoveryConflictWithLock() so that the startup process can
    send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
    Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
    the backend should check whether the deadlock actually happens or not.
    Attached is the POC patch implementing this.


good catch!

I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT shouldn't be 
set in ResolveRecoveryConflictWithLock() too (it is already set in 
ResolveRecoveryConflictWithBufferPin()).

So + 1 to consider this as a bug and for the way the patch proposes to fix it.


Thanks Victor and Bertrand for agreeing!
Attached is the updated version of the patch.


Attached is v3 of the patch. Could you review this version?

While the startup process is waiting for recovery conflict on buffer pin,
it repeats sending the request for deadlock check to all the backends
every deadlock_timeout. This may increase the workload in the startup
process and backends, but since this is the original behavior, the patch
doesn't change that. 


Agree.

IMHO that may need to be rethought (as we are already in a conflict situation, I 
am tempted to say that the less that is done the better), but I think 
that's outside the scope of this patch.


Yes, I agree that's better. I think that we should improve that as a separate
patch only for master branch, after fixing the bug and back-patching that
at first.





Also in practice this may not be so harmful because
the period that the buffer is kept pinned is basically not so long.


Agree that the chances of staying in this mode for a "long" duration are lower (as 
compared to the lock conflict).



On the other hand, IMO we should avoid this issue while the startup
process is waiting for recovery conflict on locks since it can take
a long time to release the locks. So the patch tries to fix it.

Agree


Or am I overthinking this? If this doesn't need to be handled,
the patch can be simplified more. Thought?


I do think that's good to handle it that way for the lock conflict: the less 
work done the better (especially in a conflict situation).

The patch does look good to me.


Thanks for the review!

Regards,


--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Deadlock between backend and recovery may not be detected

2020-12-21 Thread Masahiko Sawada
On Fri, Dec 18, 2020 at 6:36 PM Fujii Masao  wrote:
>
>
>
> On 2020/12/17 2:15, Fujii Masao wrote:
> >
> >
> > On 2020/12/16 23:28, Drouvot, Bertrand wrote:
> >> Hi,
> >>
> >> On 12/16/20 2:36 PM, Victor Yegorov wrote:
> >>>
> >>>
> >>>
> >>> Wed, 16 Dec 2020 at 13:49, Fujii Masao:
> >>>
> >>> After doing this procedure, you can see the startup process and 
> >>> backend
> >>> wait for the table lock each other, i.e., deadlock. But this deadlock 
> >>> remains
> >>> even after deadlock_timeout passes.
> >>>
> >>> This seems a bug to me.
> >>>
> >> +1
> >>
> >>>
> >>> > * Deadlocks involving the Startup process and an ordinary backend 
> >>> process
> >>> > * will be detected by the deadlock detector within the ordinary 
> >>> backend.
> >>>
> >>> The cause of this issue seems that ResolveRecoveryConflictWithLock() 
> >>> that
> >>> the startup process calls when recovery conflict on lock happens 
> >>> doesn't
> >>> take care of deadlock case at all. You can see this fact by reading 
> >>> the above
> >>> source code comment for ResolveRecoveryConflictWithLock().
> >>>
> >>> To fix this issue, I think that we should enable 
> >>> STANDBY_DEADLOCK_TIMEOUT
> >>> timer in ResolveRecoveryConflictWithLock() so that the startup 
> >>> process can
> >>> send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
> >>> Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
> >>> the backend should check whether the deadlock actually happens or not.
> >>> Attached is the POC patch implementing this.
> >>>
> >> good catch!
> >>
> >> I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT shouldn't 
> >> be set in ResolveRecoveryConflictWithLock() too (it is already set in 
> >> ResolveRecoveryConflictWithBufferPin()).
> >>
> >> So + 1 to consider this as a bug and for the way the patch proposes to fix 
> >> it.
> >
> > Thanks Victor and Bertrand for agreeing!
> > Attached is the updated version of the patch.
>
> Attached is v3 of the patch. Could you review this version?
>
> While the startup process is waiting for recovery conflict on buffer pin,
> it repeats sending the request for deadlock check to all the backends
> every deadlock_timeout. This may increase the workload in the startup
> process and backends, but since this is the original behavior, the patch
> doesn't change that. Also in practice this may not be so harmful because
> the period that the buffer is kept pinned is basically not so long.
>

@@ -529,6 +603,26 @@ ResolveRecoveryConflictWithBufferPin(void)
 */
ProcWaitForSignal(PG_WAIT_BUFFER_PIN);

+   if (got_standby_deadlock_timeout)
+   {
+   /*
+* Send out a request for hot-standby backends to check themselves for
+* deadlocks.
+*
+* XXX The subsequent ResolveRecoveryConflictWithBufferPin() will wait
+* to be signaled by UnpinBuffer() again and send a request for
+* deadlocks check if deadlock_timeout happens. This causes the
+* request to continue to be sent every deadlock_timeout until the
+* buffer is unpinned or ltime is reached. This would increase the
+* workload in the startup process and backends. In practice it may
+* not be so harmful because the period that the buffer is kept pinned
+* is basically not so long. But should we fix this?
+*/
+   SendRecoveryConflictWithBufferPin(
+                       PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
+   got_standby_deadlock_timeout = false;
+   }
+

Since SendRecoveryConflictWithBufferPin() sends the signal to all backends,
every backend that is waiting on a lock at ProcSleep() and not holding a
buffer pin blocking the startup process will end up doing a deadlock check,
which seems expensive. What is worse, the deadlock will not be detected,
because a deadlock involving a buffer pin isn't detected by CheckDeadLock().
I thought we could replace PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK with
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN, but that's not good because a backend
that has a buffer pin blocking the startup process but is not waiting on a
lock would also be canceled after deadlock_timeout. We could have the backend
return from RecoveryConflictInterrupt() when it receives
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN and is not waiting on any lock, but
that's also not good because then we could not cancel, after
max_standby_streaming_delay, a backend that has a buffer pin blocking the
startup process. So I wonder if we can have a new signal. When the backend
receives this signal, it returns from RecoveryConflictInterrupt() without a
deadlock check if it's either not waiting on any lock or not holding a buffer
pin that blocks the startup process. 

Re: Deadlock between backend and recovery may not be detected

2020-12-18 Thread Fujii Masao



On 2020/12/17 2:15, Fujii Masao wrote:



On 2020/12/16 23:28, Drouvot, Bertrand wrote:

Hi,

On 12/16/20 2:36 PM, Victor Yegorov wrote:




Wed, 16 Dec 2020 at 13:49, Fujii Masao <masao.fu...@oss.nttdata.com>:

    After doing this procedure, you can see the startup process and backend
    wait for the table lock each other, i.e., deadlock. But this deadlock 
remains
    even after deadlock_timeout passes.

    This seems a bug to me.


+1



    > * Deadlocks involving the Startup process and an ordinary backend process
    > * will be detected by the deadlock detector within the ordinary backend.

    The cause of this issue seems that ResolveRecoveryConflictWithLock() that
    the startup process calls when recovery conflict on lock happens doesn't
    take care of deadlock case at all. You can see this fact by reading the 
above
    source code comment for ResolveRecoveryConflictWithLock().

    To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
    timer in ResolveRecoveryConflictWithLock() so that the startup process can
    send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
    Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
    the backend should check whether the deadlock actually happens or not.
    Attached is the POC patch implementing this.


good catch!

I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT shouldn't be 
set in ResolveRecoveryConflictWithLock() too (it is already set in 
ResolveRecoveryConflictWithBufferPin()).

So + 1 to consider this as a bug and for the way the patch proposes to fix it.


Thanks Victor and Bertrand for agreeing!
Attached is the updated version of the patch.


Attached is v3 of the patch. Could you review this version?

While the startup process is waiting for recovery conflict on buffer pin,
it repeats sending the request for deadlock check to all the backends
every deadlock_timeout. This may increase the workload in the startup
process and backends, but since this is the original behavior, the patch
doesn't change that. Also in practice this may not be so harmful because
the period that the buffer is kept pinned is basically not so long.

On the other hand, IMO we should avoid this issue while the startup
process is waiting for recovery conflict on locks since it can take
a long time to release the locks. So the patch tries to fix it.

Or am I overthinking this? If this doesn't need to be handled,
the patch can be simplified more. Thought?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index ee912b9d5e..7c1a3f8b3f 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3324,6 +3324,13 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid)
  */
 pid_t
 CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
+{
+   return SignalVirtualTransaction(vxid, sigmode, true);
+}
+
+pid_t
+SignalVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode,
+bool conflictPending)
 {
ProcArrayStruct *arrayP = procArray;
int index;
@@ -3342,7 +3349,7 @@ CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
if (procvxid.backendId == vxid.backendId &&
procvxid.localTransactionId == vxid.localTransactionId)
{
-   proc->recoveryConflictPending = true;
+   proc->recoveryConflictPending = conflictPending;
pid = proc->pid;
if (pid != 0)
{
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 92d9027776..3164aed423 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -42,6 +42,10 @@ int  max_standby_streaming_delay = 30 * 1000;
 
 static HTAB *RecoveryLockLists;
 
+/* Flags set by timeout handlers */
+static volatile sig_atomic_t got_standby_deadlock_timeout = false;
+static volatile sig_atomic_t got_standby_lock_timeout = false;
+
 static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
                                                    ProcSignalReason reason,
                                                    uint32 wait_event_info,
@@ -395,8 +399,10 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
  * lock.  As we are already queued to be granted the lock, no new lock
  * requests conflicting 

Re: Deadlock between backend and recovery may not be detected

2020-12-16 Thread Fujii Masao



On 2020/12/16 23:28, Drouvot, Bertrand wrote:

Hi,

On 12/16/20 2:36 PM, Victor Yegorov wrote:




Wed, 16 Dec 2020 at 13:49, Fujii Masao <masao.fu...@oss.nttdata.com>:

After doing this procedure, you can see the startup process and backend
wait for the table lock each other, i.e., deadlock. But this deadlock 
remains
even after deadlock_timeout passes.

This seems a bug to me.


+1



> * Deadlocks involving the Startup process and an ordinary backend process
> * will be detected by the deadlock detector within the ordinary backend.

The cause of this issue seems that ResolveRecoveryConflictWithLock() that
the startup process calls when recovery conflict on lock happens doesn't
take care of deadlock case at all. You can see this fact by reading the 
above
source code comment for ResolveRecoveryConflictWithLock().

To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
timer in ResolveRecoveryConflictWithLock() so that the startup process can
send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
the backend should check whether the deadlock actually happens or not.
Attached is the POC patch implementing this.


good catch!

I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT shouldn't be 
set in ResolveRecoveryConflictWithLock() too (it is already set in 
ResolveRecoveryConflictWithBufferPin()).

So + 1 to consider this as a bug and for the way the patch proposes to fix it.


Thanks Victor and Bertrand for agreeing!
Attached is the updated version of the patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index ee912b9d5e..7c1a3f8b3f 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3324,6 +3324,13 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid)
  */
 pid_t
 CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
+{
+   return SignalVirtualTransaction(vxid, sigmode, true);
+}
+
+pid_t
+SignalVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode,
+bool conflictPending)
 {
ProcArrayStruct *arrayP = procArray;
int index;
@@ -3342,7 +3349,7 @@ CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
if (procvxid.backendId == vxid.backendId &&
procvxid.localTransactionId == vxid.localTransactionId)
{
-   proc->recoveryConflictPending = true;
+   proc->recoveryConflictPending = conflictPending;
pid = proc->pid;
if (pid != 0)
{
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 4ea3cf1f5c..4b7762644e 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -42,10 +42,15 @@ int max_standby_streaming_delay = 30 * 1000;
 
 static HTAB *RecoveryLockLists;
 
+/* Is a deadlock check pending? */
+static volatile sig_atomic_t got_standby_deadlock_timeout;
+
 static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
                                                    ProcSignalReason reason,
                                                    uint32 wait_event_info,
                                                    bool report_waiting);
+static void SendRecoveryConflictWithLock(ProcSignalReason reason,
+                                         LOCKTAG locktag);
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
@@ -395,8 +400,10 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
  * lock.  As we are already queued to be granted the lock, no new lock
  * requests conflicting with ours will be granted in the meantime.
  *
- * Deadlocks involving the Startup process and an ordinary backend process
- * will be detected by the deadlock detector within the ordinary backend.
+ * We also must check for deadlocks involving the Startup process and
+ * hot-standby backend processes. If deadlock_timeout is reached in
+ * this function, the backends are requested to check 

Re: Deadlock between backend and recovery may not be detected

2020-12-16 Thread Victor Yegorov
Wed, 16 Dec 2020 at 13:49, Fujii Masao:

> After doing this procedure, you can see the startup process and backend
> wait for the table lock each other, i.e., deadlock. But this deadlock
> remains
> even after deadlock_timeout passes.
>
> This seems a bug to me.
>
> > * Deadlocks involving the Startup process and an ordinary backend process
> > * will be detected by the deadlock detector within the ordinary backend.
>
> The cause of this issue seems that ResolveRecoveryConflictWithLock() that
> the startup process calls when recovery conflict on lock happens doesn't
> take care of deadlock case at all. You can see this fact by reading the
> above
> source code comment for ResolveRecoveryConflictWithLock().
>
> To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
> timer in ResolveRecoveryConflictWithLock() so that the startup process can
> send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
> Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
> the backend should check whether the deadlock actually happens or not.
> Attached is the POC patch implementing this.
>

I agree that this is a bug.

Unfortunately, we've been hit by it in production.
Such a deadlock will eventually make all sessions wait on the startup
process, making the streaming replica unusable. In case the replica is used
for balancing out RO queries from the primary, this causes downtime for the
project.

If I understand things right, the session will release its locks
when max_standby_streaming_delay is reached.
But it'd be much better if the conflict were resolved faster,
around deadlock_timeout.

So — huge +1 from me for fixing it.


-- 
Victor Yegorov
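
For reference, these are the two settings in play on the standby (the values below are only illustrative, not recommendations from this thread):

```
# postgresql.conf on the standby -- illustrative values only
deadlock_timeout = 1s                 # with the proposed fix, the startup process
                                      # arms a STANDBY_DEADLOCK_TIMEOUT on this cadence
max_standby_streaming_delay = 30s     # upper bound before conflicting standby
                                      # queries are canceled anyway (-1 = wait forever)
```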


Deadlock between backend and recovery may not be detected

2020-12-16 Thread Fujii Masao

Hi,

While reviewing the patch proposed at [1], I found that there is a case
where a deadlock involving a recovery conflict on a lock may not be
detected. This deadlock can happen between backends and the startup
process in the standby server. Please see the following procedure to
reproduce the deadlock.

#1. Set up streaming replication.

#2. Set max_standby_streaming_delay to -1 in the standby.

#3. Create two tables in the primary.

[PRIMARY: SESSION1]
CREATE TABLE t1 ();
CREATE TABLE t2 ();

#4. Start transaction and access to the table t1.

[STANDBY: SESSION2]
BEGIN;
SELECT * FROM t1;

#5. Start transaction and lock table t2 in access exclusive mode,
in the primary. Also execute pg_switch_wal() to transfer WAL record
for access exclusive lock to the standby.

[PRIMARY: SESSION1]
BEGIN;
LOCK TABLE t2 IN ACCESS EXCLUSIVE MODE;
SELECT pg_switch_wal();

#6. Access to the table t2 within the transaction that started at #4,
in the standby.

[STANDBY: SESSION2]
SELECT * FROM t2;

#7. Lock table t1 in access exclusive mode within the transaction that
started in #5, in the primary. Also execute pg_switch_wal() to transfer
WAL record for access exclusive lock to the standby.

[PRIMARY: SESSION1]
LOCK TABLE t1 IN ACCESS EXCLUSIVE MODE;
SELECT pg_switch_wal();

After doing this procedure, you can see that the startup process and the backend
wait for each other's table lock, i.e., a deadlock. But this deadlock remains
even after deadlock_timeout passes.

This seems a bug to me.


* Deadlocks involving the Startup process and an ordinary backend process
* will be detected by the deadlock detector within the ordinary backend.


The cause of this issue seems to be that ResolveRecoveryConflictWithLock(), which
the startup process calls when a recovery conflict on a lock happens, doesn't
take care of the deadlock case at all. You can see this fact by reading the above
source code comment for ResolveRecoveryConflictWithLock().

To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
timer in ResolveRecoveryConflictWithLock() so that the startup process can
send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
the backend should check whether the deadlock actually happens or not.
Attached is the POC patch implementing this.

Thought?

Regards,

[1] https://postgr.es/m/9a60178c-a853-1440-2cdc-c3af916cf...@amazon.com

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 4ea3cf1f5c..c398f9a3a5 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -407,7 +407,7 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 
ltime = GetStandbyLimitTime();
 
-   if (GetCurrentTimestamp() >= ltime)
+   if (GetCurrentTimestamp() >= ltime && ltime > 0)
{
/*
 * We're already behind, so clear a path as quickly as possible.
@@ -431,12 +431,21 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
/*
 * Wait (or wait again) until ltime
 */
-   EnableTimeoutParams timeouts[1];
+   EnableTimeoutParams timeouts[2];
+   int i = 0;
 
-   timeouts[0].id = STANDBY_LOCK_TIMEOUT;
-   timeouts[0].type = TMPARAM_AT;
-   timeouts[0].fin_time = ltime;
-   enable_timeouts(timeouts, 1);
+   if (ltime > 0)
+   {
+   timeouts[i].id = STANDBY_LOCK_TIMEOUT;
+   timeouts[i].type = TMPARAM_AT;
+   timeouts[i].fin_time = ltime;
+   i++;
+   }
+   timeouts[i].id = STANDBY_DEADLOCK_TIMEOUT;
+   timeouts[i].type = TMPARAM_AFTER;
+   timeouts[i].delay_ms = DeadlockTimeout;
+   i++;
+   enable_timeouts(timeouts, i);
}
 
/* Wait to be signaled by the release of the Relation Lock */
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3679799e50..3717381a6a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2923,7 +2923,11 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
 * more to do.
 */
if (!HoldingBufferPinThatDelaysRecovery())
+   {
+   if (reason == 
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK)
+   CheckDeadLockAlert();
return;
+   }
 
MyProc->recoveryConflictPending = true;