On Wed, 26 Nov 2025 at 10:25, shveta malik <[email protected]> wrote:
>
> On Wed, Nov 26, 2025 at 9:30 AM Shlok Kyal <[email protected]> wrote:
> >
> > On Wed, 26 Nov 2025 at 09:00, Hayato Kuroda (Fujitsu)
> > <[email protected]> wrote:
> > >
> > > Dear Hou,
> > >
> > > > I think we did not sync slots to standby2, so I removed the checks for 
> > > > that.
> > > >
> > > > I also adjusted the test in a way that it cleans up existing slots 
> > > > before starting
> > > > new tests.
> > >
> > > Thanks for updating the patch. I confirmed on my env that your patch 
> > > could be
> > > applied cleanly and tests were passed. Pgperltidy say nothing for your 
> > > patch.
> > > Also, I preferred the current style.
> > >
> > > I think it is worth checking on BF to see how they say.
> > >
> >
> > Thanks Amit for pushing the 0001 patch.
> > Thanks Hou-san and Kuroda-san on fixing the test.
> >
> > I have rebased the 0002 patch on the current HEAD.
> >
>
> Thanks. Please find a few comments:
>
> 1)
>
> +       The reason for the last slot synchronization skip. This field
> is set only
> +       for logical slots that are being synchronized from a primary
> server (that
> +       is, those whose <structfield>synced</structfield> field is
> +       <literal>true</literal>). The value of this column has no meaning on 
> the
> +       primary server; it defaults to <literal>none</literal> for all
> slots, but
> +       may (if leftover from a promoted standby) also have a value other than
> +       <literal>none</literal>. Possible values are:
>
> We can make this similar to existing fields (slotsync_skip_count and
> slotsync_skip_at):
>
> Slot synchronization occurs only on standby servers and thus this column has
> no meaning on the primary server.
>
> 2)
> Doc on possible values of slotsync_skip_reason can be improved. Example
>
> +          <literal>remote_behind</literal> means that the last slot
> +          synchronization was skipped because the slot is ahead of the
> +          corresponding failover slot on the primary.
>
> We can get rid of 'last slot synchronization was skipped' from all the
> reasons. (See 'invalidation_reason' possible values for reference, it
> does not mention 'was invalidated' in any).
>
> 3)
> +          <literal>wal_not_flushed</literal> means that the last slot
> +          synchronization was skipped because the standby had not flushed the
> +          WAL corresponding to the confirmed flush position on the remote 
> slot.
>
> I am not sure if we need to mention 'confirmed flush position'. Shall we say:
>
> '.....because the standby had not flushed the WAL corresponding to the
> position reserved on the remote slot'.
> Thoughts?
>
I think the suggested wording would be more clear to understand for
the user. Added the change.

> 4)
> +          <literal>none</literal> means that the last slot synchronization
> +          completed successfully.
>
> Do we even need to mention 'none' in doc? 'invalidation_reason' does
> not mention it.
>
> 5)
> postgres=# select slot_name, invalidation_reason, slotsync_skip_reason
> from pg_replication_slots;
>    slot_name    | invalidation_reason | slotsync_skip_reason
> ----------------+---------------------+----------------------
>  failover_slot2 |                     | none
>
> Shall we keep 'slotsync_skip_reason' as NULL instead of 'none' similar
> to invalidation_reason. Thoughts?
>
I agree with your suggestion. I have removed the 'none' value and used
NULL instead.

> 6)
> If we plan to keep slotsync_skip_reason as NULL instead of 'none' for
> non-skipped cases (above comment), then below code can be optimised as
> we do not need to update 'none' as stats.
> 'skip_reason' and last update() call can then be removed and we can
> simply call update_slotsync_skip_stats() instead of 'skip_reason =
> SS_SKIP_NO_CONSISTENT_SNAPSHOT'.
>
> update_local_synced_slot():
>
> + SlotSyncSkipReason skip_reason = SS_SKIP_NONE;
> + update_slotsync_skip_stats(SS_SKIP_REMOTE_BEHIND);
> + skip_reason = SS_SKIP_NO_CONSISTENT_SNAPSHOT;
> + /* Update slot sync skip stats */
> + update_slotsync_skip_stats(skip_reason);
>
I think we need this change even if we use NULL instead of 'none'.
This change ensures that the slot sync reason is set to NULL if slot
sync is successful.

> 7)
>
> +static char *
> +GetSlotSyncSkipReason(SlotSyncSkipReason reason)
>
> We are passing 'SlotSyncSkipReason' and function name says
> 'GetSlotSyncSkipReason', looks confusing.
> Shall we rename function name to GetSlotSyncSkipString or
> GetSlotSyncSkipReasonName (similar to GetSlotInvalidationCauseName)
>
> 8)
>
> + * A slotsync skip typically occurs only for temporary slots. For
> + * persistent slots it is extremely rare (e.g., cases like
> + * SS_SKIP_WAL_NOT_FLUSHED or SS_SKIP_REMOTE_BEHIND). Also, temporary
> + * slots are dropped after server restart, so there is no value in
> + * persisting the slotsync_skip_reason.
> + */
> + SlotSyncSkipReason slotsync_skip_reason;
>
> I feel, 'Also-->Since' will make more sense here.
>
>
> 9)
> In doc for slotsync_skip_count and slotsync_skip_at
>
> + Slot
> +        synchronization occur only on standby servers and thus this column 
> has
> +        no meaning on the primary server.
>
> occur --> occurs

I have also addressed the remaining comments and attached the updated patch.

Thanks,
Shlok Kyal

Attachment: v13-0001-Add-slotsync_skip_reason-to-pg_replication_slots.patch
Description: Binary data

Reply via email to