On Monday, February 2, 2026 5:10 PM shveta malik <[email protected]> wrote:
> 
> On Mon, Feb 2, 2026 at 12:56 PM Zhijie Hou (Fujitsu)
> <[email protected]> wrote:
> >
> >
> > Here are the updated patches.
> >
> 
> Thanks for the patch. Few trivial comments:
> 
> 1)
> + * and if the slots are not ready to be synced because of any of the
> + reasons
> + * mentioned above, then the SQL function also waits and retries
> until the slots
> + * are synchronized to the latest information. Refer to the comments
> + * in SyncReplicationSlots() for more details.
> 
> We can make it slightly more clear by mentioning that it waits only for the
> slots which existed at the start of function:
> 
> "...then the SQL function also waits and retries until the failover slots that
> existed on primary at the start of the function call are synchronized."

Improved.

> 
> 2)
> 
> We have below comment atop SyncReplicationSlots:
> 
>  * Repeatedly fetches and updates replication slot information from the
>  * primary until all slots are at least "sync ready".
> 
> We shall change this too, as now we are not only waiting for them to be sync-
> ready. Even if they are sync-ready, this function can still wait in subsequent
> runs for different reasons.

Right, fixed.

> 
> 3)
> 
> Existing test in test file:
> ##################################################
> # Test that pg_sync_replication_slots() on the standby skips and retries # 
> until
> the slot becomes sync-ready (when the remote slot catches up with # the
> locally reserved position).
> # Also verify that slotsync skip statistics are correctly updated when the #
> slotsync operation is skipped.
> ##################################################
> 
> New one added says:
> +##################################################
> +# Test that when physical replication lags behind logical replication,
> +# pg_sync_replication_slots() on the standby skips and retries until
> +physical # replication catches up. Also verify that slotsync skip
> +statistics are # correctly updated when the slotsync operation is skipped.
> +##################################################
> 
> The "need" of this new test case isn't very clear provided we already have
> testcase1. Perhaps we could revise the comment to something like:
> 
> "Test that even for a sync-ready slot, when physical replication lags behind
> logical replication, pg_sync_replication_slots() on the standby skips........"

Adjusted the comments as suggested.

In addition to addressing the comments, I revisited the recently updated
slotsync code and noticed opportunities to simplify some parameters, checks, and
codes. This will also facilitate the improvement in v2-0001 coding.

* Previously, certain function parameters(found_consistent_snapshot,
remote_slot_precedes of update_local_synced_slot()) were used to store the
reason for slot synchronization being skipped. However, now that a slot property
serves this purpose, we can simplify the code by eliminating those redundant
parameters and using the slot's property to perform the same check.

* The slot synchronization is skipped if the required WAL has not been received
and flushed. Previously, this check[1] was performed in two separate code paths.
Such duplication can lead to coding errors if changes are made in one location
without updating the other, as exemplified by the issue fixed in commit 3df4df5.
This commit consolidates the check into a single location to eliminate
redundancies and reduce the potential for future errors.

To address these points, I have created two additional patches: V3-0001 for the
first point and V3-0002 for the second. V3-0003 contains the current improvement
being discussed, and it's also simplified thanks to the preceding patches.


[1]
                /*
                 * ...
                 */
                if (remote_slot->confirmed_lsn > latestFlushPtr)
                {
                        update_slotsync_skip_stats(SS_SKIP_WAL_NOT_FLUSHED);

                        /*
                         * Can get here only if GUC 
'synchronized_standby_slots' on the
                         * primary server was not configured correctly.
                         */
                        ereport(AmLogicalSlotSyncWorkerProcess() ? LOG : ERROR,
                                        ...));
                }


Best Regards,
Hou zj


Attachment: v3-0004-Add-a-taptest.patch
Description: v3-0004-Add-a-taptest.patch

Attachment: v3-0001-Refactoring-remove-some-unnecessary-func-paramete.patch
Description: v3-0001-Refactoring-remove-some-unnecessary-func-paramete.patch

Attachment: v3-0002-Refactoring-move-similar-checks-to-a-central-plac.patch
Description: v3-0002-Refactoring-move-similar-checks-to-a-central-plac.patch

Attachment: v3-0003-Improve-the-retry-logic-in-pg_sync_replication_sl.patch
Description: v3-0003-Improve-the-retry-logic-in-pg_sync_replication_sl.patch

Reply via email to