On Tue, Jun 16, 2026 at 6:54 PM Dilip Kumar <[email protected]> wrote:
>
> On Mon, Jun 8, 2026 at 3:10 PM shveta malik <[email protected]> wrote:
> >
> > v46-0002:
> >
> > 1)
> > I was trying to verify TRY-CATCH block of ProcessPendingConflictLogTuple().
> >
> > When I force InsertConflictLogTuple() to fail with a debugger
> > attached, I see a new error raised from the CATCH block. See this dump:
> >
> > -----------------
> > [27532] WARNING:  could not log conflict to table for subscription
> > "sub1": cannot open relation "pg_conflict_log_16391"
> > [27532] ERROR:  errstart was not called
> > [27102] LOG:  background worker "logical replication apply worker"
> > (PID 27532) exited with exit code 1
> > [27548] LOG:  logical replication apply worker for subscription "sub1"
> > has started
> > [27548] ERROR:  conflict detected on relation "public.tab1":
> > conflict=insert_exists
> > [27548] DETAIL:  Could not apply remote change: remote row (4).
> >         Key already exists in unique index "tab1_pkey", modified in
> > transaction 793: key (i)=(4), local row (4).
> > [27548] CONTEXT:  processing remote data for replication origin "pg_16391...
> > -----------------
> >
> > 'ERROR:  errstart was not called' is raised perhaps due to
> > 'FlushErrorState' which sets errordata_stack_depth to -1. If I get rid
> > of FlushErrorState(), the internal ERROR is not cleared, which results
> > in the worker exiting (which we are trying to avoid).
> >
> > -------------------------
> > [30031] WARNING:  could not log conflict to table for subscription
> > "sub1": cannot open relation "pg_conflict_log_16391"
> > [30031] ERROR:  cannot open relation "pg_conflict_log_16391"
> > ------------->this needs to be handled.
> > [30031] DETAIL:  This operation is not supported for tables.
> > [30011] LOG:  background worker "logical replication apply worker"
> > (PID 30031) exited with exit code 1
> > [30043] LOG:  logical replication apply worker for subscription "sub1"
> > has started
> > [30043] ERROR:  conflict detected on relation "public.tab1":
> > conflict=insert_exists
> > [30043] DETAIL:  Could not apply remote change: remote row (12).
> >         Key already exists in unique index "tab1_pkey", modified in
> > transaction 872: key (i)=(12), local row (12).
> > ------------------------
> >
> > I am still thinking how this can be done cleanly. Meanwhile putting it
> > here for others to review/comment.
>
> I am able to reproduce the error. I will give it more thought and
> propose a fix for this.
>
> >
> > 2)
> > Also, I think InsertConflictLogTuple() in the non-error path (via
> > ReportApplyConflict()) should be wrapped in its own TRY-CATCH block.
> > When I force an error during that insert, execution falls through to
> > the start_apply CATCH block, which then attempts to insert the same
> > conflict record again via ProcessPendingConflictLogTuple(). That
> > insert fails again for the same reason, causing the apply worker to
> > error out.
> >
> > Should we keep this behavior and allow the apply worker to halt on a
> > CLT insertion failure, or would it be better to avoid disrupting
> > replication by encapsulating the insertion logic in its own TRY-CATCH
> > block and handling the issue locally by emitting it as a WARNING?
> > Thoughts?
>
> IMHO we should just log a WARNING and let the apply worker continue on
> conflict insertion failure. Let's see what others think about this.
>

I have the same opinion. Allowing CLT to block the apply worker would
be undesirable; CLT is a history/logs collection feature and should
not interrupt core logical replication work.
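
To make the intended behavior concrete, here is a rough standalone sketch
of the containment pattern. It is not the patch code: it uses plain
setjmp/longjmp as a stand-in for PG_TRY/PG_CATCH, and every name in it
(insert_conflict_log_row, log_conflict_contained, apply_changes) is
hypothetical. It only illustrates that a failed CLT insert is downgraded
to a warning and the apply loop keeps going.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are PostgreSQL APIs. */
static jmp_buf clt_error_jmp;
static int	warnings_emitted = 0;

/*
 * Simulated CLT insert. When force_failure is set it jumps out, similar in
 * spirit to an ERROR being thrown inside the real insert.
 */
static void
insert_conflict_log_row(int row, bool force_failure)
{
	if (force_failure)
		longjmp(clt_error_jmp, 1);
	printf("logged conflict for row %d\n", row);
}

/*
 * Wrap the insert so that a failure is handled locally. Returns true if the
 * row was logged, false if the failure was reported as a warning instead.
 */
static bool
log_conflict_contained(int row, bool force_failure)
{
	if (setjmp(clt_error_jmp) == 0)
	{
		insert_conflict_log_row(row, force_failure);
		return true;
	}

	/* "CATCH" path: report the problem and do not propagate it. */
	warnings_emitted++;
	fprintf(stderr, "WARNING: could not log conflict for row %d\n", row);
	return false;
}

/*
 * Simulated apply loop: every change is still applied, whether or not its
 * conflict could be logged. Returns the number of changes applied.
 */
static int
apply_changes(int nchanges, int failing_row)
{
	int			applied = 0;

	for (int row = 1; row <= nchanges; row++)
	{
		(void) log_conflict_contained(row, row == failing_row);
		applied++;
	}
	return applied;
}
```

Calling apply_changes(3, 2) in this sketch applies all three changes and
emits a single warning for row 2. The real code would additionally need
to handle error-state and memory-context cleanup correctly in the CATCH
block, which is the part still under discussion in point 1.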

thanks
Shveta

