Re: pg_upgrade: optimize replication slot caught-up check

Masahiko Sawada Wed, 14 Jan 2026 09:54:58 -0800

On Sun, Jan 11, 2026 at 7:31 PM Chao Li <[email protected]> wrote:
>
>
>
> > On Jan 10, 2026, at 05:32, Masahiko Sawada <[email protected]> wrote:
> >
> > On Fri, Jan 9, 2026 at 10:03 AM Masahiko Sawada <[email protected]> 
> > wrote:
> >>
> >> On Thu, Jan 8, 2026 at 6:59 PM Chao Li <[email protected]> wrote:
> >>>
> >>>
> >>>
> >>> I just looked into v3. Basically, it now does a shared WAL scan to find 
> >>> the newest decodable LSN and uses that to compare with all slots’ 
> >>> confirmed_flush_lsn, which significantly reduces WAL scan effort when 
> >>> there are many slots.
> >>
> >> Thank you for reviewing the patch!
> >>
> >>>
> >>> One thing I'm thinking about is that if all slots are far behind, the 
> >>> shared scan may still take a long time. Before this change, a scan could 
> >>> stop as soon as it found a pending WAL. So after the change, when there 
> >>> are only a few slots and they are far behind, the scan might end up doing 
> >>> more work than before.
> >>
> >> That's a valid concern.
> >>
> >>> As a possible optimization, maybe we could also pass the newest 
> >>> confirmed_flush_lsn to the scan. Once it finds a decodable WAL record 
> >>> newer than that confirmed_flush_lsn, we already know all slots are 
> >>> behind, so the scan could stop at that point.
> >>
> >> Sounds like a reasonable idea. I'll give it a try and see how it's 
> >> worthwhile.
> >>
> >>>
> >>> WRT the code change, I got a few comments:
> >>>
> >>> 1
> >>> ···
> >>> + * otherwise false. If last_pending_wal_p is set by the caller, it 
> >>> continues
> >>> + * scanning WAL even after detecting a decodable WAL record, in order to
> >>> + * get the last decodable WAL record's LSN.
> >>>  */
> >>> bool
> >>> -LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal)
> >>> +LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal,
> >>> +                                                                       
> >>> XLogRecPtr *last_pending_wal_p)
> >>> {
> >>>        bool            has_pending_wal = false;
> >>>
> >>>        Assert(MyReplicationSlot);
> >>>
> >>> +       if (last_pending_wal_p != NULL)
> >>> +               *last_pending_wal_p = InvalidXLogRecPtr;
> >>> ···
> >>>
> >>> The header comment seems to conflict to the code. last_pending_wal_p is 
> >>> unconditionally set to InvalidXLogRecPtr, so whatever a caller set is 
> >>> overwritten. I think you meant to say “if last_pending_wal_p is not NULL”.
> >>
> >> Agreed.
> >>
> >>>
> >>> 2
> >>> ```
> >>> @@ -286,9 +287,9 @@ 
> >>> binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
> >>> {
> >>>        Name            slot_name;
> >>>        XLogRecPtr      end_of_wal;
> >>> -       bool            found_pending_wal;
> >>> +       XLogRecPtr      last_pending_wal;
> >>> ```
> >>>
> >>> The function header comment still says “returns true if …”, that should 
> >>> be updated.
> >>>
> >>> Also, with the change, the function name becomes misleading, where “has” 
> >>> implies a boolean result, but now it will return the newest docodeable 
> >>> wal when no catching up. The function name doesn’t reflect to the actual 
> >>> behavior. Looks like the function is only used by pg_upgrade, so maybe 
> >>> rename it.
> >>
> >> Agreed, I'll incorporate the comment in the next version patch.
> >>
> >
> > I've attached the updated patch that addressed all review comments.
> >
> > Regards,
> >
> > --
> > Masahiko Sawada
> > Amazon Web Services: https://aws.amazon.com
> > <v4-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>
>
> A few comments on v4:
>
> 1
> ```
> -        * slot is considered caught up is done by an upgrade function. This
> -        * regards the slot as caught up if we don't find any decodable 
> changes.
> -        * See binary_upgrade_logical_slot_has_caught_up().
> +        * slot is considered caught up is done by an upgrade function, 
> unless the
> +        * caller sets skip_caught_up_check. This regards the slot as caught 
> up if
> +        * we don't find any decodable changes. See
> +        * binary_upgrade_logical_slot_has_caught_up().
> ```
>
> binary_upgrade_logical_slot_has_caught_up has been renamed, so this commend 
> needs to be updated.
>
> 2
> ```
> +                                       "temporary IS FALSE "
> +                                       "ORDER BY 1;",
> +                                       (skip_caught_up_check || 
> user_opts.live_check) ? "FALSE" :
>                                         "(CASE WHEN invalidation_reason IS 
> NOT NULL THEN FALSE "
>                                         "ELSE (SELECT 
> pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
>                                         "END)");
> ```
>
> pg_catalog.binary_upgrade_logical_slot_has_caught_up has been renamed and it 
> takes two parameters now.


binary_upgrade_logical_slot_has_caught_up() is still used when the old
cluster is PG18 or older, so we cannot use
binary_upgrade_check_logical_slot_pending_wal() here.

>
>
> 3
> ```
> +                               if (last_pending_wal > scan_cutoff_lsn)
> +                                       break;
> ```
>
> In LogicalReplicationSlotHasPendingWal() we early break when last_pending_wal 
> > scan_cutoff_lsn, and later the SQL check “confirmed_flush_lsn > 
> last_pending_wal”. So there is an edge case, where last_pending_wal == 
> scan_cutoff_lsn and confirmed_flush_lsn == last_pending_wal, then neither 
> early break nor caught up happens.
>
> So, I think we should use “>=“ for the both checks.

Good catch, but I think we should use ">=" only in
LogicalReplicationSlotHasPendingWal(). We can terminate the scan early
if we find any decodable WAL whose LSN is >= confirmed_flush_lsn. If
scan_cutoff (refers to a slot's confirmed_flush_lsn) ==
last_pending_wal, the slot should not ignore the last_pending_wal.

I've attached the updated patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

v5-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch
Description: Binary data

Re: pg_upgrade: optimize replication slot caught-up check

Reply via email to