On Mon, Jun 9, 2025 at 7:09 PM Vitaly Davydov <v.davy...@postgrespro.ru> wrote:
> > I think we can use this approach for HEAD and probably keep the
> > previous idea for backbranches. Keeping some value in shared_memory
> > per slot sounds risky to me in terms of introducing new bugs.
>
> Not sure what kind of problems may occur. I propose to allocate in shmem
> an array of last_saved_restart_lsn like below, which is not a part of the
> public API (see below). It will be allocated and deallocated in shmem the
> same way as ReplicationSlotCtlData. I can prepare a patch, if needed.
>
> typedef struct ReplicationSlotCtlDataExt {
>     XLogRecPtr last_saved_restart_lsn[1];
> } ReplicationSlotCtlDataExt;
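For illustration only, a minimal sketch of how such a per-slot array could
be sized and initialized in shared memory, following the usual
ShmemInitStruct()/add_size()/mul_size() pattern; this is not the attached
patch, and the function names ReplicationSlotsExtShmemSize/Init are
hypothetical:

    #include "postgres.h"

    #include "access/xlogdefs.h"
    #include "replication/slot.h"
    #include "storage/shmem.h"

    /* Hypothetical extension struct; not part of the public API. */
    typedef struct ReplicationSlotCtlDataExt
    {
        XLogRecPtr  last_saved_restart_lsn[FLEXIBLE_ARRAY_MEMBER];
    } ReplicationSlotCtlDataExt;

    static ReplicationSlotCtlDataExt *ReplicationSlotCtlExt = NULL;

    /* Size of the array: one XLogRecPtr per possible slot. */
    static Size
    ReplicationSlotsExtShmemSize(void)
    {
        if (max_replication_slots == 0)
            return 0;

        return add_size(offsetof(ReplicationSlotCtlDataExt,
                                 last_saved_restart_lsn),
                        mul_size(max_replication_slots, sizeof(XLogRecPtr)));
    }

    /* Allocate the array in shmem, alongside ReplicationSlotCtlData. */
    void
    ReplicationSlotsExtShmemInit(void)
    {
        bool        found;

        if (max_replication_slots == 0)
            return;

        ReplicationSlotCtlExt = (ReplicationSlotCtlDataExt *)
            ShmemInitStruct("Replication Slot Ctl Ext",
                            ReplicationSlotsExtShmemSize(), &found);

        if (!found)
        {
            for (int i = 0; i < max_replication_slots; i++)
                ReplicationSlotCtlExt->last_saved_restart_lsn[i] = InvalidXLogRecPtr;
        }
    }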
This could work, but I think this is not a solution for HEAD anyway. In
HEAD, it would be better to keep everything inside the ReplicationSlot
struct. At the same time, I don't like the idea of having different shared
memory structs between branches if we can avoid that.

> > Yeah, but with physical slots, it is possible that the slot's xmin
> > value is pointing to some value, say 700 (after restart), but vacuum
> > would have removed tuples from transaction IDs greater than 700 as
> > explained in email [1].
>
> I think we have no xmin problem for physical slots. The xmin values of
> physical slots are used to process HSF messages. If I correctly understood
> what you mean, you are talking about the problem which is solved by hot
> standby feedback messages. This message is used to disable tuple vacuuming
> on the primary to avoid delete conflicts on the replica in queries (some
> queries may select tuples which were vacuumed on the primary, and deletions
> are replicated to the standby). If the primary receives an HSF message
> after slot saving, I believe it is allowable if autovacuum cleans tuples
> with xmin later than the last saved value. If the primary restarts, the
> older value will be loaded, but the replica already confirmed the newer
> value. Concerning the replica, it is the replica's obligation to send an
> HSF xmin that will survive the replica's immediate restart.

+1

> >> Taking into account these thoughts, I can't see any problems with the
> >> alternative patch where the oldest WAL LSN is calculated only in
> >> checkpoint.
> >
> > The alternative will needlessly prevent removing WAL segments in some
> > cases when logical slots are in use.
>
> IMHO, I'm not sure it will significantly impact WAL removal. We remove WAL
> segments only in checkpoint. The alternate solution gets the oldest WAL
> segment at the beginning of checkpoint, then saves dirty slots to disk,
> and removes old WAL segments at the end of checkpoint using the oldest WAL
> segment obtained at the beginning of checkpoint. The alternate solution
> may not be as effective in terms of WAL segment removal if a logical slot
> is advanced during checkpoint, but I do not think it is a significant
> issue. On the other hand, the alternate solution simplifies the logic of
> WAL removal, is backward compatible (avoids adding new in-memory states),
> and decreases the number of locks in ReplicationSlotsComputeRequiredLSN -
> no need to recalculate the oldest slots' restart_lsn every time a slot is
> advanced.

So, my proposal is to commit the attached patchset to HEAD, and commit [1]
to the back branches. Any objections?

Links.
1. https://www.postgresql.org/message-id/CAPpHfdutKQxpm-gJgiZRb2ouKC9%2BHZx3fG3F00zd%3DxdxDidm_g%40mail.gmail.com

------
Regards,
Alexander Korotkov
Supabase
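For clarity, the "capture at the start of checkpoint, truncate at the end"
ordering described in the quoted paragraph above could be sketched roughly
as follows. This is illustrative only, not the attached patch:
XLogGetReplicationSlotMinimumLSN() and CheckPointReplicationSlots() are the
existing backend functions, while RemoveOldWalSegmentsUpTo() is a
hypothetical helper standing in for the actual WAL-removal code.

    #include "postgres.h"

    #include "access/xlog.h"
    #include "replication/slot.h"

    /* Hypothetical helper standing in for the real WAL-removal code. */
    extern void RemoveOldWalSegmentsUpTo(XLogRecPtr cutoff);

    /* Illustrative sketch only; not the attached patch. */
    static void
    CheckPointSlotsAndWalSketch(void)
    {
        XLogRecPtr  oldest_restart_lsn;

        /* At the start of checkpoint: capture the oldest slot restart_lsn. */
        oldest_restart_lsn = XLogGetReplicationSlotMinimumLSN();

        /* Save dirty replication slots to disk. */
        CheckPointReplicationSlots(false);

        /*
         * At the end of checkpoint: remove old WAL segments, but never past
         * the LSN captured above, even if a slot advanced in the meantime.
         */
        RemoveOldWalSegmentsUpTo(oldest_restart_lsn);
    }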
v3-0001-Keep-WAL-segments-by-slot-s-flushed-restart-LSN.patch
Description: Binary data
v3-0002-Add-TAP-tests-to-check-replication-slot-advance-d.patch
Description: Binary data