On Thu, Aug 28, 2025 at 3:29 PM Kirill Reshke <reshkekir...@gmail.com> wrote:
>
> On Thu, 28 Aug 2025 at 14:56, Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma <ashu.coe...@gmail.com> wrote:
> > >
> > > We have seen cases where slot synchronization gets delayed, for example
> > > when the slot is behind the failover standby or vice versa, and the slot
> > > sync worker has to wait for one to catch up with the other. During this
> > > waiting period, users querying pg_replication_slots can only see whether
> > > the slot has been synchronized or not. If it has already synchronized,
> > > that’s fine, but if synchronization is taking longer, users would
> > > naturally want to understand the reason for the delay.
> > >
> > > Is there a way for end users to know the cause of slot synchronization
> > > delays, so they can take appropriate actions to speed it up?
> > >
> > > I understand that server logs are emitted in such cases, but logs are not
> > > something end users would want to check regularly. Moreover, since
> > > logging is configuration-based, relevant messages may sometimes be
> > > skipped or suppressed.
> > >
> >
> > Currently, the way to see the reason for sync skip is LOGs but I think
> > it is better to add a new column like sync_skip_reason in
> > pg_replication_slots. This can show the reasons like
> > standby_LSN_ahead_remote_LSN. I think ideally users can compare
> > standby's slot LSN/XMIN with remote_slot being synced. Do you have any
> > better ideas?
> >
>
> How about something like pg_stat_progress_replication_slot with remote
> LSN/standby LSN/catalog XID etc?
> Wouldn't this be in sync with all other debug pg_stat_progress* views
> and thus more Postgres-y?
>

Yes, that is another option. I am a little worried that sync does not
always lag behind, so having a separate view just for sync progress may
be too much. Yet another option is the existing view
pg_stat_replication_slots, though sync progress doesn't seem to fit
there directly. For example, we could add a sync_skipped counter, the
time of the last sync skip, and a last_sync_skip_reason, which could be
sufficient to dig into the problem further.

--
With Regards,
Amit Kapila.