On Tue, 19 Nov 2024 at 12:43, Nisha Moond <[email protected]> wrote:
>
> Attached is the v49 patch set:
> - Fixed the bug reported in [1].
> - Addressed comments in [2] and [3].
>
> I've split the patch into two, implementing the suggested idea in
> comment #5 of [2] separately in 001:
>
> Patch-001: Adds additional error reports (for all invalidation types)
> in ReplicationSlotAcquire() for invalid slots when error_if_invalid =
> true.
> Patch-002: The original patch with comments addressed.
This Assert can fail:
+ /*
+ * Check if the slot needs to
be invalidated due to
+ *
replication_slot_inactive_timeout GUC.
+ */
+ if (now &&
+
TimestampDifferenceExceeds(s->inactive_since, now,
+
replication_slot_inactive_timeout_sec *
1000))
+ {
+ invalidation_cause = cause;
+ inactive_since =
s->inactive_since;
+
+ /*
+ * Invalidation due to
inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
With the following scenario:
Set replication_slot_inactive_timeout to 10 seconds
-- Create a slot
postgres=# select pg_create_logical_replication_slot ('test',
'pgoutput', true, true);
pg_create_logical_replication_slot
------------------------------------
(test,0/1748068)
(1 row)
-- Wait for 10 seconds and execute checkpoint
postgres=# checkpoint;
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
The assert fails:
#5 0x00005b074f0c922f in ExceptionalCondition
(conditionName=0x5b074f2f0b4c "s->active_pid == 0",
fileName=0x5b074f2f0010 "slot.c", lineNumber=1762) at assert.c:66
#6 0x00005b074ee26ead in InvalidatePossiblyObsoleteSlot
(cause=RS_INVAL_INACTIVE_TIMEOUT, s=0x740925361780, oldestLSN=0,
dboid=0, snapshotConflictHorizon=0, invalidated=0x7fffaee87e63) at
slot.c:1762
#7 0x00005b074ee273b2 in InvalidateObsoleteReplicationSlots
(cause=RS_INVAL_INACTIVE_TIMEOUT, oldestSegno=0, dboid=0,
snapshotConflictHorizon=0) at slot.c:1952
#8 0x00005b074ee27678 in CheckPointReplicationSlots
(is_shutdown=false) at slot.c:2061
#9 0x00005b074e9dfda7 in CheckPointGuts (checkPointRedo=24412528,
flags=108) at xlog.c:7513
#10 0x00005b074e9df4ad in CreateCheckPoint (flags=108) at xlog.c:7179
#11 0x00005b074edc6bfc in CheckpointerMain (startup_data=0x0,
startup_data_len=0) at checkpointer.c:463
Regards,
Vignesh