On Fri, Mar 8, 2024 at 10:42 PM Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > You might want to consider its interaction with sync slots on standby.
> > Say, there is no activity on slots in terms of processing the changes
> > for slots. Now, we won't perform sync of such slots on standby showing
> > them inactive as per your new criteria whereas same slots could still
> > be valid on primary as the walsender is still active. This may be more
> > of a theoretical point as in running system there will probably be
> > some activity but I think this needs some thoughts.
>
> I believe the xmin and catalog_xmin of the sync slots on the standby
> keep advancing depending on the slots on the primary, no? If yes, the
> XID age based invalidation shouldn't be a problem.

If the user has not enabled the slot-sync worker and is relying on the
SQL function pg_sync_replication_slots(), then the xmin and
catalog_xmin of synced slots may not keep advancing. They are advanced
only on the next run of the function. Meanwhile, the synced slots may
be invalidated due to 'xid_aged'. The next time the user runs
pg_sync_replication_slots(), the invalidated slots will be dropped and
recreated by this SQL function (provided they are valid on the primary
and are invalidated on the standby alone). I am not saying that this is
a problem, but we need to think about whether this is what we want.

Secondly, the behaviour is not the same as with 'inactive_timeout'
invalidation. Synced slots are immune to 'inactive_timeout'
invalidation because that invalidation happens only in walsender,
whereas they are not immune to 'xid_aged' invalidation. So again, this
needs some thought.

> I believe there are no walsenders started for the sync slots on the
> standbys, right? If yes, the inactive timeout based invalidation also
> shouldn't be a problem. Because, the inactive timeouts for a slot are
> tracked only for walsenders because they are the ones that typically
> hold replication slots for longer durations and for real replication
> use. We did a similar thing in a recent commit [1].
>
> Is my understanding right? Do you still see any problems with it?

I have explained the situation above so that we can think it over.

thanks
Shveta