Hi On Wed, May 20, 2026 at 11:49 PM vignesh C <[email protected]> wrote:
> On Mon, 11 May 2026 at 08:31, Fujii Masao <[email protected]> wrote: > > > > On Sun, May 10, 2026 at 5:45 AM SATYANARAYANA NARLAPURAM > > <[email protected]> wrote: > > > > > > Hi Hackers, > > > > > > SQL-callable replication slot functions acquire a slot (setting > > > the process-global MyReplicationSlot) but can then ERROR before > reaching > > > ReplicationSlotRelease(). If such an error is caught by a PL/pgSQL > > > EXCEPTION block (which uses a subtransaction), MyReplicationSlot > remains > > > set because there is no subtransaction-level cleanup hook for > replication > > > slots. > > > > > > Any subsequent slot operation in the same session then hits > > > Assert(MyReplicationSlot == NULL) and crashes the backend on assert > > > enabled builds. In release builds the stale MyReplicationSlot is > silently overwritten, > > > permanently orphaning the old slot as "active." The orphaned slot > blocks any other > > > session from acquiring it, vacuum and WAL deletion. > > > > > > Repro: > > > > > > SELECT pg_create_logical_replication_slot('adv_test', 'test_decoding'); > > > > > > DO $$ BEGIN > > > PERFORM pg_replication_slot_advance('adv_test', '0/1'::pg_lsn); > > > EXCEPTION WHEN others THEN > > > RAISE NOTICE 'caught: %', SQLERRM; > > > END $$; > > > > > > SELECT count(*) FROM pg_logical_slot_get_changes('adv_test', NULL, > NULL); > > > > > > 2026-05-09 19:45:06.619 UTC [1096805] STATEMENT: SELECT > pg_create_logical_replication_slot('adv_test', 'test_decoding'); > > > TRAP: failed Assert("MyReplicationSlot == NULL"), File: "slot.c", > Line: 638, PID: 1096805 > > > > > > > > > Attached a patch to address this by wrapping error-prone paths in > PG_TRY/PG_CATCH blocks > > > and call ReplicationSlotRelease(). > > > > Thanks for the report and the patch! > > > > I think wrapping the slot-processing code with PG_TRY()/PG_CATCH() seems > > a good direction for addressing the issue you reported. > > > > > > + PG_CATCH(); > > + { > > + ReplicationSlotRelease(); > > > > When create_logical_replication_slot() is called with temporary = true, > > the created logical replication slot has RS_TEMPORARY persistency. Such > a slot > > is not dropped by ReplicationSlotRelease(), whereas an RS_EPHEMERAL slot > is > > dropped via ReplicationSlotDropAcquired(). > > > > So even with the v1 patch, a temporary logical replication slot can > remain > > unexpectedly if pg_create_logical_replication_slot() throws an error. > > In this case, should create_logical_replication_slot() explicitly drop > the slot > > with ReplicationSlotDropAcquired(), or temporarily change the slot > persistency > > to RS_EPHEMERAL before calling ReplicationSlotRelease()? > > > > > > Does a newly created logical replication slot created by > > pg_copy_logical_replication_slot() have the same issue? > > Additionally pg_logical_slot_get_changes also has the same issue, it > can be reproduced by the following: > SELECT pg_create_logical_replication_slot('test_slot_1', 'test_decoding'); > > DO $$ > BEGIN > -- This will ERROR if the slot_get changes fails for the slot. > PERFORM 1 FROM pg_logical_slot_get_changes('test_slot_1', NULL, > NULL, 'nonexistent-option', 'val'); > EXCEPTION WHEN others THEN > RAISE NOTICE 'caught: %', SQLERRM; > END $$; > > SELECT count(*) FROM pg_logical_slot_get_changes('test_slot_1', NULL, > NULL); > > TRAP: failed Assert("MyReplicationSlot == NULL"), File: "slot.c", > Line: 638, PID: 80308 > postgres: vignesh postgres [local] SELECT(ExceptionalCondition+0xba) > [0x642e7b2ebae1] > postgres: vignesh postgres [local] SELECT(ReplicationSlotAcquire+0x6e) > [0x642e7b00d732] Thank you for letting me know. Fixing these cases in the next update, will send it shortly. Thanks, Satya > > >
