On Wed, Apr 3, 2024 at 9:57 PM Bertrand Drouvot
<bertranddrouvot...@gmail.com> wrote:
>
> > > shouldn't the slot be dropped/recreated instead of updating 
> > > inactive_since?
> >
> > The sync slots that are invalidated on the primary aren't dropped and
> > recreated on the standby.
>
> Yeah, right (I was confused with synced slots that are invalidated locally).
>
> > However, I
> > found that the synced slot is acquired and released unnecessarily
> > after the invalidation_reason is synced from the primary. I added a
> > skip check in synchronize_one_slot to skip acquiring and releasing the
> > slot if it's locally found inactive. With this, inactive_since won't
> > get updated for invalidated sync slots on the standby as we don't
> > acquire and release the slot.
>
> CR1 ===
>
> Yeah, I can see:
>
> @@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid 
> remote_dbid)
>                                                    " name slot \"%s\" already 
> exists on the standby",
>                                                    remote_slot->name));
>
> +               /*
> +                * Skip the sync if the local slot is already invalidated. We 
> do this
> +                * beforehand to save on slot acquire and release.
> +                */
> +               if (slot->data.invalidated != RS_INVAL_NONE)
> +                       return false;
>
> Thanks to the drop_local_obsolete_slots() call I think we are not missing the 
> case
> where the slot has been invalidated on the primary, invalidation reason has 
> been
> synced on the standby and later the slot is dropped/ recreated manually on the
> primary (then it should be dropped/recreated on the standby too).
>
> Also it seems we are not missing the case where a sync slot is invalidated
> locally due to wal removal (it should be dropped/recreated).

Right.

> > > CR5 ===
> > >
> > > +       /*
> > > +        * This function isn't expected to be called for inactive timeout 
> > > based
> > > +        * invalidation. A separate function 
> > > InvalidateInactiveReplicationSlot is
> > > +        * to be used for that.
> > >
> > > Do you think it's worth to explain why?
> >
> > Hm, I just wanted to point out the actual function here. I modified it
> > to something like the following, if others feel we don't need that, I
> > can remove it.
>
> Sorry If I was not clear but I meant to say "Do you think it's worth to 
> explain
> why we decided to create a dedicated function"? (currently we "just" explain 
> that
> we created one).

We added a new function (InvalidateInactiveReplicationSlot) to
invalidate slot based on inactive timeout because 1) we do the
inactive timeout invalidation at slot level as opposed to
InvalidateObsoleteReplicationSlots which does loop over all the slots,
2)
InvalidatePossiblyObsoleteSlot does release the lock in some cases,
has a lot of unneeded code for inactive timeout invalidation check, 3)
we want some control over saving the slot to disk because we hook the
inactive timeout invalidation into the loop that checkpoints the slot
info to the disk in CheckPointReplicationSlots.

I've added a comment atop InvalidateInactiveReplicationSlot.

Please find the attached v36 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment: v36-0001-Add-inactive_timeout-based-replication-slot-inva.patch
Description: Binary data

Reply via email to