On Wed, Jun 11, 2025 at 7:19 AM shveta malik <shveta.ma...@gmail.com> wrote:
>
> On Tue, Jun 10, 2025 at 3:20 PM Zhijie Hou (Fujitsu)
> <houzj.f...@fujitsu.com> wrote:
> >
> >
> > Thanks for updating the patch.
> >
> > I have few suggestions for the document from a user's perspective.
> >
>
> Thanks Hou-San, I agree with your suggestions. Addressed in v4.
>
> Also addressed Amit's suggestion at [1] to improve errdetail.
>

So, the overall direction we are taking here is that we want to
improve the existing LOG/DEBUG messages and docs for HEAD and back
branches. Then we will improve the API behavior based on Hou-San's
patch for PG19. Let me know if you or others think otherwise.

+    <para>
+     Apart from enabling <link linkend="guc-sync-replication-slots">
+     <varname>sync_replication_slots</varname></link> to synchronize slots
+     periodically, failover slots can be manually synchronized by invoking
+     <link linkend="pg-sync-replication-slots">
+     <function>pg_sync_replication_slots</function></link> on the standby.
+     However, this function is primarily intended for testing and debugging
+     purposes and should be used with caution. The recommended approach to
+     synchronize slots is by enabling <link linkend="guc-sync-replication-slots">
+     <varname>sync_replication_slots</varname></link> on the standby, as it
+     ensures continuous and automatic synchronization of replication slots,
+     facilitating seamless failover and high availability.
+    </para>
+
+    <para>
+     When slot-synchronization setup is done as recommended, and
+     slot-synchronization is performed the very first time either automatically
+     or by <link linkend="pg-sync-replication-slots">
+     <function>pg_sync_replication_slots</function></link>,
+     then for the synchronized slot to be created and persisted on the standby,
+     one condition must be met. The logical replication slot on the primary
+     must reach a state where the WALs and system catalog rows retained by
+     the slot are also present on the corresponding standby server. This is
+     needed to prevent any data loss and to allow logical replication to continue
+     seamlessly through the synchronized slot if needed after promotion.
+     If the WALs and system catalog rows retained by the slot on the primary have
+     already been purged from the standby server, and synchronization is attempted
+     for the first time, then to prevent the data loss as explained, persistence
+     and synchronization of newly created slot will be skipped, and the following
+     log message may appear on standby.
+<programlisting>
+     LOG:  could not synchronize replication slot "failover_slot"
+     DETAIL:  Synchronization could lead to data loss as the remote slot needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby has LSN 0/3003F28 and catalog xmin 756
+</programlisting>
+     If the logical replication slot is actively consumed by a consumer, no further
+     manual action is needed by the user, as the slot on primary will be advanced
+     automatically, and synchronization will proceed in the next cycle. However,
+     if no logical replication consumer is set up yet, to advance the slot, it
+     is recommended to manually run the <link linkend="pg-logical-slot-get-changes">
+     <function>pg_logical_slot_get_changes</function></link> or
+     <link linkend="pg-logical-slot-get-binary-changes">
+     <function>pg_logical_slot_get_binary_changes</function></link> on the primary
+     slot and allow synchronization to proceed.
+    </para>
+

I have reworded the above as follows:
To enable periodic synchronization of replication slots, it is
recommended to activate sync_replication_slots on the standby server.
While manual synchronization is possible using
pg_sync_replication_slots, this function is primarily intended for
testing and debugging and should be used with caution. Automatic
synchronization via sync_replication_slots ensures continuous slot
updates, supporting seamless failover and maintaining high
availability. When slot synchronization is configured as recommended,
and the initial synchronization is performed either automatically or
manually via pg_sync_replication_slots, the standby can persist the
synchronized slot only if the following condition is met: The logical
replication slot on the primary must retain WALs and system catalog
rows that are still available on the standby. This ensures data
integrity and allows logical replication to continue smoothly after
promotion.
If the required WALs or catalog rows have already been purged from the
standby, the slot will not be persisted to avoid data loss. In such
cases, the following log message may appear:

LOG: could not synchronize replication slot "failover_slot"
DETAIL: Synchronization could lead to data loss as the remote slot
needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby has
LSN 0/3003F28 and catalog xmin 756

If the logical replication slot is actively used by a consumer, no
manual intervention is needed; the slot will advance automatically,
and synchronization will resume in the next cycle. However, if no
consumer is configured, it is advisable to manually advance the slot
on the primary using pg_logical_slot_get_changes or
pg_logical_slot_get_binary_changes, allowing synchronization to
proceed.

Let me know what you think of the above.
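
For anyone following along, the persistence condition in the reworded
text can be illustrated with a rough sketch. This is not the slotsync
code; the function names are hypothetical, and it assumes that XID
wraparound can be ignored so that catalog xmin values compare as plain
integers. It only shows why the values in the example log message
cause the slot to be skipped.

```python
# Hypothetical illustration of the persistence condition; not the actual
# slot synchronization code. Assumption: XID wraparound is ignored, so
# catalog xmin values are compared as plain integers.

def parse_lsn(text: str) -> int:
    """Convert an LSN in 'hi/lo' hex form (e.g. '0/3003F28') to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def can_persist_synced_slot(remote_lsn: str, remote_xmin: int,
                            standby_lsn: str, standby_xmin: int) -> bool:
    """Return True when what the remote slot needs is still on the standby."""
    # The remote slot must not need WAL older than what the standby has.
    wal_available = parse_lsn(remote_lsn) >= parse_lsn(standby_lsn)
    # The remote slot must not need catalog rows older than the standby's xmin.
    catalog_rows_available = remote_xmin >= standby_xmin
    return wal_available and catalog_rows_available


# Values from the example log message: the LSNs match, but the remote slot
# needs catalog xmin 754 while the standby has catalog xmin 756.
print(can_persist_synced_slot("0/3003F28", 754, "0/3003F28", 756))
# After the slot on the primary advances (made-up values), the check passes.
print(can_persist_synced_slot("0/3003F30", 757, "0/3003F28", 756))
```

With the values from the log message the first call reports that the
slot cannot be persisted, because the catalog xmin required by the
remote slot (754) is older than what the standby has (756). The second
call uses made-up values to show the state after the slot on the
primary has advanced.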

-- 
With Regards,
Amit Kapila.

