On Wed, Dec 23, 2020 at 9:42 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote: > > > > On 2020/12/23 19:28, Masahiko Sawada wrote: > > On Tue, Dec 22, 2020 at 11:58 PM Fujii Masao > > <masao.fu...@oss.nttdata.com> wrote: > >> > >> > >> > >> On 2020/12/22 20:42, Fujii Masao wrote: > >>> > >>> > >>> On 2020/12/22 10:25, Masahiko Sawada wrote: > >>>> On Fri, Dec 18, 2020 at 6:36 PM Fujii Masao > >>>> <masao.fu...@oss.nttdata.com> wrote: > >>>>> > >>>>> > >>>>> > >>>>> On 2020/12/17 2:15, Fujii Masao wrote: > >>>>>> > >>>>>> > >>>>>> On 2020/12/16 23:28, Drouvot, Bertrand wrote: > >>>>>>> Hi, > >>>>>>> > >>>>>>> On 12/16/20 2:36 PM, Victor Yegorov wrote: > >>>>>>>> > >>>>>>>> *CAUTION*: This email originated from outside of the organization. > >>>>>>>> Do not click links or open attachments unless you can confirm the > >>>>>>>> sender and know the content is safe. > >>>>>>>> > >>>>>>>> > >>>>>>>> ср, 16 дек. 2020 г. в 13:49, Fujii Masao > >>>>>>>> <masao.fu...@oss.nttdata.com <mailto:masao.fu...@oss.nttdata.com>>: > >>>>>>>> > >>>>>>>> After doing this procedure, you can see the startup process > >>>>>>>> and backend > >>>>>>>> wait for the table lock each other, i.e., deadlock. But this > >>>>>>>> deadlock remains > >>>>>>>> even after deadlock_timeout passes. > >>>>>>>> > >>>>>>>> This seems a bug to me. > >>>>>>>> > >>>>>>> +1 > >>>>>>> > >>>>>>>> > >>>>>>>> > * Deadlocks involving the Startup process and an ordinary > >>>>>>>> backend process > >>>>>>>> > * will be detected by the deadlock detector within the > >>>>>>>> ordinary backend. > >>>>>>>> > >>>>>>>> The cause of this issue seems that > >>>>>>>> ResolveRecoveryConflictWithLock() that > >>>>>>>> the startup process calls when recovery conflict on lock > >>>>>>>> happens doesn't > >>>>>>>> take care of deadlock case at all. You can see this fact by > >>>>>>>> reading the above > >>>>>>>> source code comment for ResolveRecoveryConflictWithLock(). > >>>>>>>> > >>>>>>>> To fix this issue, I think that we should enable > >>>>>>>> STANDBY_DEADLOCK_TIMEOUT > >>>>>>>> timer in ResolveRecoveryConflictWithLock() so that the startup > >>>>>>>> process can > >>>>>>>> send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the > >>>>>>>> backend. > >>>>>>>> Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal > >>>>>>>> arrives, > >>>>>>>> the backend should check whether the deadlock actually happens > >>>>>>>> or not. > >>>>>>>> Attached is the POC patch implimenting this. > >>>>>>>> > >>>>>>> good catch! > >>>>>>> > >>>>>>> I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT > >>>>>>> shouldn't be set in ResolveRecoveryConflictWithLock() too (it is > >>>>>>> already set in ResolveRecoveryConflictWithBufferPin()). > >>>>>>> > >>>>>>> So + 1 to consider this as a bug and for the way the patch proposes > >>>>>>> to fix it. > >>>>>> > >>>>>> Thanks Victor and Bertrand for agreeing! > >>>>>> Attached is the updated version of the patch. > >>>>> > >>>>> Attached is v3 of the patch. Could you review this version? > >>>>> > >>>>> While the startup process is waiting for recovery conflict on buffer > >>>>> pin, > >>>>> it repeats sending the request for deadlock check to all the backends > >>>>> every deadlock_timeout. This may increase the workload in the startup > >>>>> process and backends, but since this is the original behavior, the patch > >>>>> doesn't change that. Also in practice this may not be so harmful because > >>>>> the period that the buffer is kept pinned is basically not so long. > >>>>> > >>>> > >>>> @@ -529,6 +603,26 @@ ResolveRecoveryConflictWithBufferPin(void) > >>>> */ > >>>> ProcWaitForSignal(PG_WAIT_BUFFER_PIN); > >>>> > >>>> + if (got_standby_deadlock_timeout) > >>>> + { > >>>> + /* > >>>> + * Send out a request for hot-standby backends to check > >>>> themselves for > >>>> + * deadlocks. > >>>> + * > >>>> + * XXX The subsequent ResolveRecoveryConflictWithBufferPin() > >>>> will wait > >>>> + * to be signaled by UnpinBuffer() again and send a request for > >>>> + * deadlocks check if deadlock_timeout happens. This causes the > >>>> + * request to continue to be sent every deadlock_timeout until > >>>> the > >>>> + * buffer is unpinned or ltime is reached. This would increase > >>>> the > >>>> + * workload in the startup process and backends. In practice it > >>>> may > >>>> + * not be so harmful because the period that the buffer is kept > >>>> pinned > >>>> + * is basically no so long. But we should fix this? > >>>> + */ > >>>> + SendRecoveryConflictWithBufferPin( > >>>> + > >>>> PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK); > >>>> + got_standby_deadlock_timeout = false; > >>>> + } > >>>> + > >>>> > >>>> Since SendRecoveryConflictWithBufferPin() sends the signal to all > >>>> backends every backend who is waiting on a lock at ProcSleep() and not > >>>> holding a buffer pin blocking the startup process will end up doing a > >>>> deadlock check, which seems expensive. What is worse is that the > >>>> deadlock will not be detected because the deadlock involving a buffer > >>>> pin isn't detected by CheckDeadLock(). I thought we can replace > >>>> PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK with > >>>> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN but it’s not good because the > >>>> backend who has a buffer pin blocking the startup process and not > >>>> waiting on a lock is also canceled after deadlock_timeout. We can have > >>>> the backend return from RecoveryConflictInterrupt() when it received > >>>> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN and is not waiting on any lock, > >>>> but it’s also not good because we cannot cancel the backend after > >>>> max_standby_streaming_delay that has a buffer pin blocking the startup > >>>> process. So I wonder if we can have a new signal. When the backend > >>>> received this signal it returns from RecoveryConflictInterrupt() > >>>> without deadlock check either if it’s not waiting on any lock or if it > >>>> doesn’t have a buffer pin blocking the startup process. Otherwise it's > >>>> cancelled. > >>> > >>> Thanks for pointing out that issue! Using new signal is an idea. Another > >>> idea > >>> is to make a backend skip check the deadlock if > >>> GetStartupBufferPinWaitBufId() > >>> returns -1, i.e., the startup process is not waiting for buffer pin. So, > >>> what I'm thinkins is; > >>> > >>> In RecoveryConflictInterrupt(), when a backend receive > >>> PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK, > >>> > >>> 1. If a backend isn't waiting for a lock, it does nothing . > >>> 2. If a backend is waiting for a lock and also holding a buffer pin that > >>> delays recovery, it may be canceled. > >>> 3. If a backend is waiting for a lock and the startup process is not > >>> waiting > >>> for buffer pin (i.e., the startup process is also waiting for a > >>> lock), > >>> it checks for the deadlocks. > >>> 4. If a backend is waiting for a lock and isn't holding a buffer pin that > >>> delays recovery though the startup process is waiting for buffer pin, > >>> it does nothing. > >> > > > > Good idea! It could still happen that if the startup process sets > > startupBufferPinWaitBufId to -1 after sending the signal and before > > the backend checks it, the backend will end up doing an unmeaningful > > deadlock check. But the likelihood would be low in practice. > > > > I have two small comments on ResolveRecoveryConflictWithBufferPin() in > > the v4 patch: > > > > The code currently has three branches as follow: > > > > if (ltime == 0) > > { > > enable a timeout for deadlock; > > } > > else if (GetCurrentTimestamp() >= ltime) > > { > > send recovery conflict signal; > > } > > else > > { > > enable two timeouts: ltime and deadlock > > } > > > > I think we can rearrange the code similar to the changes you made on > > ResolveRecoveryConflictWithLock(): > > > > if (GetCurrentTimestamp() >= ltime && ltime != 0) > > { > > Resolve recovery conflict; > > } > > else > > { > > Enable one or two timeouts: ltime and deadlock > > } > > > > It's more consistent with ResolveRecoveryConflictWithLock(). And > > currently the patch doesn't reset got_standby_deadlock_timeout in > > (ltime == 0) case but it will also be resolved by this rearrangement. > > I didn't want to change the code structure as much as possible because > the patch needs to be back-patched. But it's good idea to make the code > structures in ResolveRecoveryConflictWithLock() and ...WithBufferPin() > similar, > to make the code simpler and easier-to-read. So I agree with you. Attached > is the updated of the patch. What about this version?
Thank you for updating the patch! The patch looks good to me. > > > > > --- > > If we always reset got_standby_deadlock_timeout before waiting it's > > not necessary but we might want to clear got_standby_deadlock_timeout > > also after disabling all timeouts to ensure that it's cleared at the > > end of the function. In ResolveRecoveryConflictWithLock() we clear > > both got_standby_lock_timeout and got_standby_deadlock_timeout after > > disabling all timeouts but we don't do that in > > ResolveRecoveryConflictWithBufferPin(). > > I agree that it's better to reset got_standby_deadlock_timeout after > all the timeouts are disabled. So I changed the patch that way. OTOH > got_standby_lock_timeout doesn't need to be reset because it's never > enabled in ResolveRecoveryConflictWithBufferPin(). No? Yes, you're right. We need to clear only got_standby_deadlock_timeout. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/