On Mon, Jan 19, 2026 at 10:38 PM shveta malik <[email protected]> wrote:
>
> On Wed, Jan 14, 2026 at 11:24 PM Masahiko Sawada <[email protected]> wrote:
> >
> > I've attached the updated patch.
> >
> Thank you for the patch. I like the idea of the optimization. A few
> initial comments:
Thank you for reviewing the patch!
>
> 1)
> + * The query returns the slot names and their caught-up status in
> + * the same order as the results collected by
> + * get_old_cluster_logical_slot_infos(). If this query is changed,
>
> I could not find the function get_old_cluster_logical_slot_infos(); did
> you mean get_old_cluster_logical_slot_infos_query()?
It seems to be an oversight in commit 6d3d2e8e541f0. I think it should
refer to get_db_rel_and_slot_infos().
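So the comment would read something like this (a sketch of the fix,
keeping the rest of the comment as-is):

     * The query returns the slot names and their caught-up status in
     * the same order as the results collected by
     * get_db_rel_and_slot_infos(). If this query is changed,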
>
> 2)
> " WHERE database = current_database() AND "
> " slot_type = 'logical' AND "
>
> Is there a reason why database = current_database() is placed before
> slot_type = 'logical'? I am not sure how the PostgreSQL optimizer and
> executor will order these predicates, but at first glance slot_type =
> 'logical' appears cheaper and could be placed first, consistent with
> the ordering used elsewhere.
Changed.
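For reference, the reordered predicates look like this (a sketch; the
surrounding query text is unchanged):

    " WHERE slot_type = 'logical' AND "
    "       database = current_database() AND "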
>
> 3)
> Shouldn’t we add a sanity check inside
> get_old_cluster_logical_slot_infos_query() to ensure that when
> skip_caught_up_check is true, we are on PostgreSQL 18 or lower? This
> would make the function safer for future use if it's called elsewhere.
> I understand the caller already performs a similar check, but I think
> it's more appropriate here since we call
> binary_upgrade_logical_slot_has_caught_up() from inside, which doesn’t
> even exist on newer versions.
What kind of sanity check did you mean? We could add a check with
pg_fatal(), but to me that seems almost the same as letting pg_upgrade
fail with an error about the missing
binary_upgrade_logical_slot_has_caught_up() function.
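For example, something like the following (a rough sketch; the exact
condition, version cutoff, and message are assumptions on my part, not
code from the patch):

    /* hypothetical guard; names and version cutoff are assumptions */
    if (skip_caught_up_check &&
        GET_MAJOR_VERSION(old_cluster.major_version) > 1800)
        pg_fatal("replication slot caught-up status can only be checked on PostgreSQL 18 or earlier");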
>
> 4)
> +# Check the file content. While both test_slot1 and test_slot2 should be
> +# reporting that they have unconsumed WAL records, test_slot3 should not be
> +# reported as it has caught up.
>
> Can you please elaborate on the reason behind test_slot3 not being
> reported? Also, mention it in the comment if possible.
We advance test_slot3 to the current WAL LSN before executing
pg_upgrade, so test_slot3 should have consumed all pending WAL records.
Please refer to the following changes:
# Preparations for the subsequent test:
-# 1. Generate extra WAL records. At this point neither test_slot1 nor
-# test_slot2 has consumed them.
+# 1. Generate extra WAL records. At this point none of the slots has consumed them.
#
# 2. Advance the slot test_slot2 up to the current WAL location, but test_slot1
# still has unconsumed WAL records.
#
# 3. Emit a non-transactional message. This will cause test_slot2 to detect the
# unconsumed WAL record.
+#
+# 4. Advance the slot test_slot3 up to the current WAL location.
$oldpub->start;
$oldpub->safe_psql(
'postgres', qq[
CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;
SELECT pg_replication_slot_advance('test_slot2', pg_current_wal_lsn());
- SELECT count(*) FROM pg_logical_emit_message('false',
-     'prefix', 'This is a non-transactional message');
+ SELECT count(*) FROM pg_logical_emit_message('false',
+     'prefix', 'This is a non-transactional message', true);
+ SELECT pg_replication_slot_advance('test_slot3', pg_current_wal_lsn());
I believe that the following new comment explains the reason well:
+# Check the file content. While both test_slot1 and test_slot2 should be
+# reporting that they have unconsumed WAL records, test_slot3 should not be
+# reported as it has caught up.
I've attached the updated patch.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
v6-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch
Description: Binary data
