On Wed, Nov 26, 2025 at 9:30 AM Shlok Kyal <[email protected]> wrote:
>
> On Wed, 26 Nov 2025 at 09:00, Hayato Kuroda (Fujitsu)
> <[email protected]> wrote:
> >
> > Dear Hou,
> >
> > > I think we did not sync slots to standby2, so I removed the checks for 
> > > that.
> > >
> > > I also adjusted the test in a way that it cleans up existing slots before 
> > > starting
> > > new tests.
> >
> > Thanks for updating the patch. I confirmed on my env that your patch could 
> > be
> > applied cleanly and tests were passed. Pgperltidy say nothing for your 
> > patch.
> > Also, I preferred the current style.
> >
> > I think it is worth checking on BF to see how they say.
> >
>
> Thanks Amit for pushing the 0001 patch.
> Thanks Hou-san and Kuroda-san on fixing the test.
>
> I have rebased the 0002 patch on the current HEAD.
>

Thanks. Please find a few comments:

1)

+       The reason for the last slot synchronization skip. This field
is set only
+       for logical slots that are being synchronized from a primary
server (that
+       is, those whose <structfield>synced</structfield> field is
+       <literal>true</literal>). The value of this column has no meaning on the
+       primary server; it defaults to <literal>none</literal> for all
slots, but
+       may (if leftover from a promoted standby) also have a value other than
+       <literal>none</literal>. Possible values are:

We can make this similar to existing fields (slotsync_skip_count and
slotsync_skip_at):

Slot synchronization occurs only on standby servers and thus this column has
no meaning on the primary server.

2)
Doc on possible values of slotsync_skip_reason can be improved. Example

+          <literal>remote_behind</literal> means that the last slot
+          synchronization was skipped because the slot is ahead of the
+          corresponding failover slot on the primary.

We can get rid of 'last slot synchronization was skipped' from all the
reasons. (See 'invalidation_reason' possible values for reference, it
does not mention 'was invalidated' in any).

3)
+          <literal>wal_not_flushed</literal> means that the last slot
+          synchronization was skipped because the standby had not flushed the
+          WAL corresponding to the confirmed flush position on the remote slot.

I am not sure if we need to mention 'confirmed flush position'. Shall we say:

'.....because the standby had not flushed the WAL corresponding to the
position reserved on the remote slot'.
Thoughts?

4)
+          <literal>none</literal> means that the last slot synchronization
+          completed successfully.

Do we even need to mention 'none' in doc? 'invalidation_reason' does
not mention it.

5)
postgres=# select slot_name, invalidation_reason, slotsync_skip_reason
from pg_replication_slots;
   slot_name    | invalidation_reason | slotsync_skip_reason
----------------+---------------------+----------------------
 failover_slot2 |                     | none

Shall we keep 'slotsync_skip_reason' as NULL instead of 'none' similar
to invalidation_reason. Thoughts?

6)
If we plan to keep slotsync_skip_reason as NULL instead of 'none' for
non-skipped cases (above comment), then below code can be optimised as
we do not need to update 'none' as stats.
'skip_reason' and last update() call can then be removed and we can
simply call update_slotsync_skip_stats() instead of 'skip_reason =
SS_SKIP_NO_CONSISTENT_SNAPSHOT'.

update_local_synced_slot():

+ SlotSyncSkipReason skip_reason = SS_SKIP_NONE;
+ update_slotsync_skip_stats(SS_SKIP_REMOTE_BEHIND);
+ skip_reason = SS_SKIP_NO_CONSISTENT_SNAPSHOT;
+ /* Update slot sync skip stats */
+ update_slotsync_skip_stats(skip_reason);

7)

+static char *
+GetSlotSyncSkipReason(SlotSyncSkipReason reason)

We are passing 'SlotSyncSkipReason' and function name says
'GetSlotSyncSkipReason', looks confusing.
Shall we rename function name to GetSlotSyncSkipString or
GetSlotSyncSkipReasonName (similar to GetSlotInvalidationCauseName)

8)

+ * A slotsync skip typically occurs only for temporary slots. For
+ * persistent slots it is extremely rare (e.g., cases like
+ * SS_SKIP_WAL_NOT_FLUSHED or SS_SKIP_REMOTE_BEHIND). Also, temporary
+ * slots are dropped after server restart, so there is no value in
+ * persisting the slotsync_skip_reason.
+ */
+ SlotSyncSkipReason slotsync_skip_reason;

I feel, 'Also-->Since' will make more sense here.


9)
In doc for slotsync_skip_count and slotsync_skip_at

+ Slot
+        synchronization occur only on standby servers and thus this column has
+        no meaning on the primary server.

occur --> occurs

thanks
Shveta


Reply via email to