On Monday, January 12, 2026 10:58 AM Amit Kapila <[email protected]> wrote:
> Thanks. I pushed this last week.

Since the main patch has been committed, I am sharing a patch to fix another,
similar bug in the slot sync code path:

When a replication slot is initially synced to the standby server, the remote
restart_lsn queried from the publisher is used as the initial restart_lsn for
the newly synced slot. Because slots are synced asynchronously, the synced
restart_lsn may lag behind the standby's redo pointer. Since there is no
interlock between WAL reservation and checkpoints, this creates a race condition
in which a checkpoint might remove required WAL and invalidate the newly synced
slot.

To address this issue, similar to commit 006dd4b, I think we can first acquire
ReplicationSlotAllocationLock when reserving the WAL position for the newly
synced slot. This ensures that if WAL reservation occurs first, the checkpoint
process is forced to wait for slotsync to update the newly synced slot's
restart_lsn before calculating the minimum LSN.

However, unlike in ReplicationSlotReserveWal(), this lock cannot protect a newly
synced slot if a concurrent checkpoint has invoked CheckPointReplicationSlots()
before slotsync updates the slot's restart_lsn. This is because, in the slotsync
case, the initial restart_lsn received from the remote server may be outdated
and precede the redo pointer. Therefore, when selecting the initial restart_lsn,
this commit uses the redo pointer or the minimum slot LSN (if those values are
greater than the remote restart_lsn) instead of relying solely on the remote
value.
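
To illustrate the selection rule, here is a minimal standalone sketch. It is
not the code from the patch: choose_initial_restart_lsn() and its parameters
are hypothetical names, XLogRecPtr is a local stand-in typedef, and it assumes
one possible reading of the rule, namely that the safe lower bound is the older
of the redo pointer and the minimum slot LSN (when one exists). The patch
itself may combine these values differently.

```c
#include <stdint.h>

/* Local stand-ins for PostgreSQL's XLogRecPtr / InvalidXLogRecPtr. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/*
 * Hypothetical helper (not from the patch): pick the initial restart_lsn
 * for a newly synced slot.
 *
 * Assumption: WAL older than min(redo pointer, minimum slot LSN) may
 * already have been removed by a checkpoint, so a remote restart_lsn
 * below that bound is raised to the bound.
 */
static XLogRecPtr
choose_initial_restart_lsn(XLogRecPtr remote_restart_lsn,
                           XLogRecPtr redo_ptr,
                           XLogRecPtr slot_min_lsn)
{
    XLogRecPtr  min_safe_lsn = redo_ptr;

    /* An existing slot may hold back WAL older than the redo pointer. */
    if (slot_min_lsn != InvalidXLogRecPtr && slot_min_lsn < min_safe_lsn)
        min_safe_lsn = slot_min_lsn;

    /* Use the remote value only if it is not behind the safe bound. */
    if (remote_restart_lsn < min_safe_lsn)
        return min_safe_lsn;

    return remote_restart_lsn;
}
```

Under these assumptions, a remote restart_lsn that is already ahead of the
bound is used unchanged, so the adjustment only matters for the outdated case
described above.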

0001 contains the fix, and 0002 contains a test that reproduces the problem
using an injection point.

Note that there's a third issue[1] where a new slot may be invalidated if
ReplicationSlotsComputeRequiredLSN is invoked concurrently. I'll start a new
thread to address that.

[1] https://www.postgresql.org/message-id/TY4PR01MB16907DCA80DBC3E77CE6B203294C9A%40TY4PR01MB16907.jpnprd01.prod.outlook.com

Best Regards,
Hou zj

Attachment: v1-0002-Add-a-tap-test-using-injection-point.patch

Attachment: v1-0001-Fix-race-conditions-causing-invalidation-of-newly.patch
