On Sat, Apr 6, 2024 at 11:49 AM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta.ma...@gmail.com> wrote: > > > > Please find v2. Changes are: > > 1) Rebased the patch as there was conflict due to recent commit 6f132ed. > > 2) Added an Assert in update_synced_slots_inactive_since() to ensure > > that the slot does not have active_pid. > > 3) Improved commit msg and comments. > > > > Few comments: > ============== > 1. > void > SyncReplicationSlots(WalReceiverConn *wrconn) > { > + /* > + * Startup process signaled the slot sync to stop, so if meanwhile user > + * has invoked slot sync SQL function, simply return. > + */ > + SpinLockAcquire(&SlotSyncCtx->mutex); > + if (SlotSyncCtx->stopSignaled) > + { > + ereport(LOG, > + errmsg("skipping slot synchronization as slot sync shutdown is > signaled during promotion")); > + > + SpinLockRelease(&SlotSyncCtx->mutex); > + return; > + } > + SpinLockRelease(&SlotSyncCtx->mutex); > > There is a race condition with this code. Say during promotion > ShutDownSlotSync() is just before setting this flag and the user has > invoked pg_sync_replication_slots() and passed this check but still > didn't set the SlotSyncCtx->syncing flag. So, now, the promotion would > recognize that there is slot sync going on in parallel, and slot sync > wouldn't know that the promotion is in progress.
Did you mean that now, the promotion *would not* recognize... I see, I will fix this. > The current coding for slot sync worker ensures there is no such race > by checking the stopSignaled and setting the pid together under > spinlock. I think we need to move the setting of the syncing flag > similarly. Once we do that probably checking SlotSyncCtx->syncing > should be sufficient in ShutDownSlotSync(). If we change the location > of setting the 'syncing' flag then please ensure its cleanup as we > currently do in slotsync_failure_callback(). Sure, let me review. > 2. > @@ -1395,6 +1395,7 @@ update_synced_slots_inactive_since(void) > if (s->in_use && s->data.synced) > { > Assert(SlotIsLogical(s)); > + Assert(s->active_pid == 0); > > We can add a comment like: "The slot must not be acquired by any process" > > -- > With Regards, > Amit Kapila.