Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

shveta malik Thu, 20 Nov 2025 02:32:25 -0800

On Thu, Nov 20, 2025 at 3:21 PM Masahiko Sawada <[email protected]> wrote:
>
> On Thu, Nov 20, 2025 at 1:27 AM shveta malik <[email protected]> wrote:
> >
> > On Wed, Nov 19, 2025 at 11:04 PM Masahiko Sawada <[email protected]> 
> > wrote:
> > >
> > > On Wed, Nov 19, 2025 at 1:21 AM shveta malik <[email protected]> 
> > > wrote:
> > > >
> > > > On Wed, Nov 19, 2025 at 1:52 AM Masahiko Sawada <[email protected]> 
> > > > wrote:
> > > > >
> > > > > On Mon, Nov 17, 2025 at 10:49 PM Amit Kapila 
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, Nov 18, 2025 at 1:02 AM Masahiko Sawada 
> > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > On Sun, Nov 16, 2025 at 9:51 PM Amit Kapila 
> > > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > >
> > > > > > > > In v26-0002-FIXUP-remove-status_change_allowed-flag, by using
> > > > > > > > status_change_inprogress, we ensure that no backend is allowed 
> > > > > > > > to
> > > > > > > > toggle the logical_wal/decoding status till startup process 
> > > > > > > > marks the
> > > > > > > > recovery state as recovery_done. I am trying to think what 
> > > > > > > > problem
> > > > > > > > this part of design prevents. I have considered the following
> > > > > > > > scenarios:
> > > > > > > >
> > > > > > > > Scenario-1:
> > > > > > > > 1. Startup process enables logical_wal and logical_decoding. 
> > > > > > > > Writes
> > > > > > > > WAL record for it
> > > > > > > > 2. Backend disables logical_decoding, writes WAL for it, and 
> > > > > > > > disables
> > > > > > > > logical_wal.
> > > > > > > > 3. Startup process sets recovery_done and allows wal_writes
> > > > > > > >
> > > > > > > > Say, instead of using status_change_inprogress to prevent doing
> > > > > > > > step-2, if we had used recovery_in_progress kind of flag then 
> > > > > > > > how is
> > > > > > > > it possible for backends to create any problem for the current 
> > > > > > > > node or
> > > > > > > > cascaded standbys? I think the only way a problem can happen is 
> > > > > > > > if we
> > > > > > > > write the WAL to disable_logical decoding after any backend 
> > > > > > > > could have
> > > > > > > > written a non-logical WAL information record. Can that happen 
> > > > > > > > if we
> > > > > > > > use the recovery_in_progress flag to prevent disable of 
> > > > > > > > logical_wal?
> > > > > > > > If so, how?
> > > > > > >
> > > > > > > The main idea of holding status_change_inprogress until the 
> > > > > > > recovery
> > > > > > > end is to prevent concurrent toggling the logical decoding 
> > > > > > > status. In
> > > > > > > your scenario, IIUC backends cannot write any WAL yet at step-2 
> > > > > > > since
> > > > > > > it's allowed at step-3. It would end up with a FATAL error 
> > > > > > > actually.
> > > > > > > One alternative is to make processes call 
> > > > > > > LocalSetXLogInsertAllowed()
> > > > > > > so that they can write WAL even during recovery, but I don't use 
> > > > > > > it as
> > > > > > > I'm concerned that it could lead to other problems. On the other 
> > > > > > > hand,
> > > > > > > we cannot let the backend to disable logical_decoding and 
> > > > > > > logical_wal
> > > > > > > without WAL warite at step-2 because otherwise the cascaded 
> > > > > > > standby
> > > > > > > won't disable logical decoding.
> > > > > > >
> > > > > >
> > > > > > Why can't we postpone disabling logical WAL, decoding to the next
> > > > > > cycle of checkpointer when RecoveryInProgress() is true without
> > > > > > relying on status_change_inprogress? So, this will lead to a window
> > > > > > where there are no logical slots but still the effective_wal_level 
> > > > > > is
> > > > > > logical. However, this could be true even without considering this
> > > > > > problem because the checkpointer can take some time to disable the
> > > > > > logical WAL and decoding.
> > > > > >
> > > > > > The other problematic case to consider is during promotion, the
> > > > > > startup has marked logical decoding as disabled but not yet marked
> > > > > > recovery-done. Then the backend created a slot and returned without
> > > > > > marking logical decoding as enabled due to relying on
> > > > > > RecoveryInProgress(). Then the start-up marked Recovery-Done. Now we
> > > > > > have a logical slot present, but logical decoding is disabled. I 
> > > > > > think
> > > > > > we can simply disallow the creation of a logical slot in this window
> > > > > > (where effective_wal_level is 'replica' and RecoveryInProgress() is
> > > > > > true).
> > > > >
> > > > > It sounds reasonable. Backends are already prohibited from creating
> > > > > logical slots when effective_wal_level is 'replica' and
> > > > > RecoveryInProgress() is true, so it should not be a problem.
> > > > >
> > > > > > If the above is feasible and sounds reasonable, then we don't even
> > > > > > need the status_change_inprogress flag, at least not during the
> > > > > > start-up flow.
> > > > >
> > > > > I've updated the patch based on the above suggestion. I believe we
> > > > > still need the status_change_inprogress flag when not in recovery but
> > > > > in the new version I don't use the flag during end-of-recovery action.
> > > > >
> > > >
> > > > Thanks for the patch.
> > > >
> > > > I had a look at the new changes. It appears that the startup process
> > > > now marks the state as pending_disable rather than directly disabling
> > > > logical decoding itself. In this setup, I think the situation Amit
> > > > described will no longer occur— the case where the startup process
> > > > disables logical decoding, and during the brief window before
> > > > recovery-done is set, a slot-creation attempt is issued and gets
> > > > blocked.
> > > >
> > > > Currently, in the scenario where the primary has one slot and the
> > > > standby has zero slots, if I promote the standby and then try to
> > > > create another slot in parallel (after
> > > > UpdateLogicalDecodingStatusEndOfRecovery but before marking
> > > > RECOVERY_STATE_DONE), the slot creation proceeds (instead of being
> > > > blocked) and eventually hangs in DecodingContextFindStartpoint()—it
> > > > sleeps in read_local_xlog_page_guts() until the promotion completes.
> > > >
> > > > IIUC, Amit’s original idea was to disable logical decoding within the
> > > > startup process itself rather than doing it lazily and block the
> > > > slot-creation in parallel.
> > >
> > > Indeed. I misunderstood the idea. The attached new version patch
> > > should implement the idea correctly.
> > >
> >
> > Thanks. I will review.
> >
> > On reconsideration, it looks like the 'status_change_inprogress' flag
> > can be removed entirely.
> >
> > The scenario where EnsureLogicalDecodingEnabled() is trying to enable
> > logical decoding while DisableLogicalDecodingIfNecessary() is
> > attempting to disable it cannot occur in practice. This is because a
> > drop-logical-slot operation (for the only slot present) would fail in
> > parallel if the slot is already held by a backend in
> > EnsureLogicalDecodingEnabled().
>
> Could you elaborate on this case? IIUC we attempt to disable after
> dropping the slot, so it could happen that
> DisableLogicalDecodingIfNecessary() and EnsureLogicalDecodingEnabled()
> are executed concurrently.
>


I was referring to the scenario where backend1 triggers the creation
of slot1 and is executing EnsureLogicalDecodingEnabled(), while
backend2 attempts to drop the same slot1 in parallel. In that
situation, the drop will fail outright and won’t even reach
DisableLogicalDecodingIfNecessary(), producing an error like:
ERROR: replication slot "slot1" is active for PID <>.

Now, consider the case where a slot already exists (say slot1).
Suppose EnsureLogicalDecodingEnabled() is invoked for another slot
(slot2) while DisableLogicalDecodingIfNecessary() is triggered
concurrently to drop a pre-existing slot (slot1).

If EnsureLogicalDecodingEnabled() acquires the lock first via
start_logical_decoding_status_change(), it will essentially be a no-op
since logical decoding is already enabled. Once the lock is released,
the disable operation may proceed, which will also be no-op as
logical-slot exists. This will leave logical-decoding in the enabled
state, which is expected.

If DisableLogicalDecodingIfNecessary() acquires the lock first, then,
as mentioned earlier, making the first four steps atomic and releasing
the lock only after disabling xlog_logical_info ensures safety. If
somehow EnsureLogicalDecodingEnabled() executes between step 4 and
step 5 of the disable process, it will re-enable logical decoding. And
even if step 5 of DisableLogicalDecodingIfNecessary() emits the signal
later, it will be of no harm, as xlog_logical_info will reflect the
correct state (true) because EnsureLogicalDecodingEnabled() has
already re-enabled it.

thanks
Shveta

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

Reply via email to