Prevent invalidation of newly synced replication slots.

A race condition could cause a newly synced replication slot to become
invalidated between its initial sync and the checkpoint.

When syncing a replication slot to a standby, the slot's initial
restart_lsn is taken from the publisher's remote_restart_lsn. Because slot
sync happens asynchronously, this value can lag behind the standby's
current redo pointer. Without any interlocking between WAL reservation and
checkpoints, a checkpoint may remove WAL required by the newly synced
slot, causing the slot to be invalidated.

To fix this, we acquire ReplicationSlotAllocationLock before reserving WAL
for a newly synced slot, similar to commit 006dd4b2e5. This ensures that
if WAL reservation happens first, the checkpoint process must wait for
slotsync to update the slot's restart_lsn before it computes the minimum
required LSN.

However, unlike in ReplicationSlotReserveWal(), this lock alone cannot
protect a newly synced slot if a checkpoint has already run
CheckPointReplicationSlots() before slotsync updates the slot. In such
cases, the remote restart_lsn may be stale and earlier than the current
redo pointer. To prevent relying on an outdated LSN, we use the oldest WAL
location available if it is greater than the remote restart_lsn.

This ensures that newly synced slots always start with a safe, non-stale
restart_lsn and are not invalidated by concurrent checkpoints.
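
The choice of starting position described above can be sketched in a few
lines of C. This is only an illustration under stated assumptions, not
the code from the commit: choose_initial_restart_lsn() and its parameter
names are hypothetical, XLogRecPtr is redefined locally as a plain 64-bit
integer, and the locking around WAL reservation is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's XLogRecPtr, a 64-bit WAL position. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/*
 * Hypothetical helper (not part of PostgreSQL): pick the initial
 * restart_lsn for a newly synced slot.
 *
 * remote_restart_lsn   - value fetched from the publisher; may be stale
 * oldest_available_lsn - oldest WAL location still present on the standby
 *
 * If the remote value points before the oldest WAL still available,
 * start from the oldest available location instead, so that the slot
 * never depends on WAL a checkpoint may already have removed.
 */
static XLogRecPtr
choose_initial_restart_lsn(XLogRecPtr remote_restart_lsn,
                           XLogRecPtr oldest_available_lsn)
{
    /* Assumption of this sketch: the caller passes a valid position. */
    assert(oldest_available_lsn != InvalidXLogRecPtr);

    if (oldest_available_lsn > remote_restart_lsn)
        return oldest_available_lsn;

    return remote_restart_lsn;
}
```

With a stale remote value of 0x1000 and an oldest available location of
0x3000, the helper returns 0x3000; with a remote value of 0x5000 it
returns 0x5000 unchanged.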

Author: Zhijie Hou <[email protected]>
Reviewed-by: Hayato Kuroda <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Vitaly Davydov <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Backpatch-through: 17
Discussion: https://postgr.es/m/TY4PR01MB16907E744589B1AB2EE89A31F94D7A%40TY4PR01MB16907.jpnprd01.prod.outlook.com

Branch
------
REL_18_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/919c9fa13cd0684b437a88719d670a9bf6dd0dc8

Modified Files
--------------
src/backend/access/transam/xlog.c                  |  6 +-
src/backend/replication/logical/slotsync.c         | 99 +++++++++++-----------
src/include/access/xlog.h                          |  1 +
src/test/recovery/t/046_checkpoint_logical_slot.pl | 84 +++++++++++++++++-
4 files changed, 137 insertions(+), 53 deletions(-)
