On Fri, Feb 14, 2025 at 5:30 PM Nisha Moond <nisha.moond...@gmail.com> wrote: > > Here is a summary of changes in v78: >
A few minor comments: 1. Slots that appear idle due to a disrupted connection between + the publisher and subscriber are also excluded, as they are managed by + <link linkend="guc-wal-sender-timeout"><varname>wal_sender_timeout</varname></link>. ... How do we exclude the above kind of slots? I think it is trying to cover the case where walsender is not exited even after the connection is broken between publisher and subscriber. The point is quite confusing and adds much less value. So, we can remove it. 2. - * Returns true when any slot have got invalidated. + * Returns true if there are any invalidated slots. ... I find the existing comment more suitable for this function and easy to follow. Apart from the above, I have changed a few other comments and minor cosmetic cleanup. -- With Regards, Amit Kapila.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index 4ff66f0dd5..476b8e5355 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -606,6 +606,11 @@ retry: if (!nowait) ConditionVariablePrepareToSleep(&s->active_cv); + /* + * It is important to reset the inactive_since under spinlock here to + * avoid race conditions with slot invalidation. See comments related + * to inactive_since in InvalidatePossiblyObsoleteSlot. + */ SpinLockAcquire(&s->mutex); if (s->active_pid == 0) s->active_pid = MyProcPid; @@ -1641,11 +1646,10 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s, if (possible_causes & RS_INVAL_HORIZON) { - if (SlotIsLogical(s) && /* invalid DB oid signals a shared relation */ + if (SlotIsLogical(s) && (dboid == InvalidOid || dboid == s->data.database)) { - if (TransactionIdIsValid(initial_effective_xmin) && TransactionIdPrecedesOrEquals(initial_effective_xmin, snapshotConflictHorizon)) @@ -1757,11 +1761,13 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes, * The slot's mutex will be released soon, and it is possible that * those values change since the process holding the slot has been * terminated (if any), so record them here to ensure that we - * would report the correct invalidation cause. No need to record - * inactive_since for the idle_timeout case here, as an already - * inactive slot's inactive_since can only be reset under a mutex - * in ReplicationSlotAcquire(), and an inactive slot can be - * invalidated immediately without releasing the spinlock. + * would report the correct invalidation cause. + * + * Unlike others slot's inactive_since can't be changed once it is + * acquired till it gets released or the process owning it gets + * terminated. The slot remains active till some process owns it. + * So, the inactive slot can only be invalidated immediately + * without being terminated. */ if (!terminated) {