Hi Alexander,
On Sat, Mar 7, 2026 at 7:00 PM Alexander Lakhin <[email protected]> wrote:
>
> Hello Xuneng and Heikki,
>
> 04.03.2026 07:33, Xuneng Zhou wrote:
>
> 03.03.2026 17:39, Heikki Linnakangas wrote:
>
> On 24/02/2026 10:00, Alexander Lakhin wrote:
>
> The "terminating process ..." message doesn't appear when the test passes
> successfully.
>
> Hmm, right, looks like something wrong in signaling the recovery conflict. I
> can't tell if the signal is being sent,
> or it's not processed correctly. Looking at the code, I don't see anything
> wrong.
>
> I was unable to reproduce the issue on an x86_64 Linux machine using
> the provided script. All test runs completed successfully without any
> failures.
>
>
> I've added debug logging (see attached) and saw the following:
> !!!SignalRecoveryConflict[282363]
> !!!ProcArrayEndTransaction| pendingRecoveryConflicts = 0
> !!!ProcessInterrupts[283863]| MyProc->pendingRecoveryConflicts: 0
> !!!ProcessInterrupts[283863]| MyProc->pendingRecoveryConflicts: 0
> 2026-03-07 12:21:24.544 EET walreceiver[282421] FATAL: could not receive
> data from WAL stream: server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> 2026-03-07 12:21:24.645 EET postmaster[282355] LOG: received immediate
> shutdown request
> 2026-03-07 12:21:24.647 EET postmaster[282355] LOG: database system is shut
> down
>
> While for a successful run, I see:
> 2026-03-07 12:18:17.075 EET startup[285260] DETAIL: The slot conflicted with
> xid horizon 677.
> 2026-03-07 12:18:17.075 EET startup[285260] CONTEXT: WAL redo at 0/04022130
> for Heap2/PRUNE_ON_ACCESS: snapshotConflictHorizon: 677, isCatalogRel: T,
> nplans: 0, nredirected: 0, ndead: 2, nunused: 0, dead: [35, 36]; blkref #0:
> rel 1663/16384/16418, blk 10
> !!!SignalRecoveryConflict[285260]
> !!!ProcessInterrupts[286071]| MyProc->pendingRecoveryConflicts: 16
> !!!ProcessRecoveryConflictInterrupts[286071]
> !!!ProcessRecoveryConflictInterrupts[286071] pending: 16, reason: 4
> 2026-03-07 12:18:17.075 EET walsender[286071] 035_standby_logical_decoding.pl
> ERROR: canceling statement due to conflict with recovery
> 2026-03-07 12:18:17.075 EET walsender[286071] 035_standby_logical_decoding.pl
> DETAIL: User was using a logical replication slot that must be invalidated.
>
> (Full logs for this failed run and a good run are attached.)
>
Thanks again for the logs. I think we might frame the next run around
three explicit hypotheses:
1) Self-clear: the target walsender clears its own
pendingRecoveryConflicts during xact cleanup before
ProcessInterrupts() consumes it.
2) Group-clear-on-behalf: another backend clears the target PGPROC via
group clear before consume.
3) Visibility gap: the weak pg_atomic_read_u32() in the interrupt path
observes 0 even though the conflict bit was set.
The bad vs good traces are consistent with a receiver-side handoff failure:
- bad run: signal path reached, target walsender later sees
pendingRecoveryConflicts = 0, never dispatches conflict;
- good run: target sees pending = 0x10, dispatches reason 4
(RECOVERY_CONFLICT_LOGICALSLOT), throws expected ERROR, releases slot.
I prepared an updated diagnostics patch to classify hypotheses (1) and
(2) directly:
- SignalRecoveryConflict: logs target pid/procno and old/new mask
- clear sites: logs self_pid, target_pid, target_procno, self_is_target
- ProcessInterrupts: logs handler counter + pending-mask observations
Two caveats:
- The weak/strong telemetry is supportive but not definitive for
hypothesis (3); it may miss the exact stale-read window.
- The patch preserves the original dispatch path, but added
diagnostics (especially barriered reads in clear paths) can perturb
timing, so repro rate may shift.
What to look for:
Hypothesis (1), self-clear: a ProcArrayEndTransaction (or
ProcArrayClearTransaction self-case) line with self_is_target=t
clearing pendingRecoveryConflicts=0x10 for the target pid/procno,
followed by the target's ProcessInterrupts reporting that the handler
fired but pending=0.
Hypothesis (2), group-clear-on-behalf: a ProcArrayEndTransactionInternal
(or ProcArrayClearTransaction) line with self_is_target=f and
target_procno=<walsender procno> clearing
pendingRecoveryConflicts=0x10.
Hypothesis (3), visibility gap (suggestive only): no matching clear-site
line for the target procno, yet the target still reports that the
handler fired with pending=0.
If this run is inconclusive, I suggest a direct behavioral A/B test as
the next step:
- change only the ProcessInterrupts() recovery-conflict check from
pg_atomic_read_u32() to pg_atomic_read_membarrier_u32();
- if the failures disappear, that strongly supports hypothesis (3).
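Concretely, the A/B change would be a one-liner at the tail of
ProcessInterrupts() (a sketch against the check visible in the attached
patch; surrounding context may differ in your tree):

```diff
-	if (pg_atomic_read_u32(&MyProc->pendingRecoveryConflicts) != 0)
+	if (pg_atomic_read_membarrier_u32(&MyProc->pendingRecoveryConflicts) != 0)
 		ProcessRecoveryConflictInterrupts();
```

Since the membarrier variant implies a full barrier, a fetch-or that
completed on the sender before signal delivery should then be visible
at this check, closing the stale-read window if one exists.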
--
Best,
Xuneng
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 40312df2cac..20207ca3af4 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -709,6 +709,28 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
/* be sure this is cleared in abort */
proc->delayChkptFlags = 0;
+ {
+ uint32 old_mask;
+ bool self_is_target;
+
+ /* Avoid strong read cost on the common zero case. */
+ old_mask = pg_atomic_read_u32(&proc->pendingRecoveryConflicts);
+ if (old_mask != 0)
+ {
+ old_mask = pg_atomic_read_membarrier_u32(&proc->pendingRecoveryConflicts);
+ self_is_target = (proc == MyProc);
+ if (old_mask != 0)
+ ereport(LOG,
+ (errmsg("DBG ProcArrayEndTransaction self_pid=%d "
+ "target_pid=%d target_procno=%d self_is_target=%c "
+ "clearing pendingRecoveryConflicts=0x%x",
+ getpid(), (int) proc->pid,
+ (int) GetNumberFromPGProc(proc),
+ self_is_target ? 't' : 'f',
+ old_mask)));
+ }
+ }
+
pg_atomic_write_u32(&proc->pendingRecoveryConflicts, 0);
/* must be cleared with xid/xmin: */
@@ -751,6 +773,29 @@ ProcArrayEndTransactionInternal(PGPROC *proc, TransactionId latestXid)
/* be sure this is cleared in abort */
proc->delayChkptFlags = 0;
+ {
+ uint32 old_mask;
+ bool self_is_target;
+
+ /* Avoid strong read cost on the common zero case. */
+ old_mask = pg_atomic_read_u32(&proc->pendingRecoveryConflicts);
+ if (old_mask != 0)
+ {
+ old_mask = pg_atomic_read_membarrier_u32(&proc->pendingRecoveryConflicts);
+ self_is_target = (proc == MyProc);
+ if (old_mask != 0)
+ ereport(LOG,
+ (errmsg("DBG ProcArrayEndTransactionInternal self_pid=%d "
+ "target_pid=%d target_procno=%d self_is_target=%c "
+ "clearing "
+ "pendingRecoveryConflicts=0x%x",
+ getpid(), (int) proc->pid,
+ (int) GetNumberFromPGProc(proc),
+ self_is_target ? 't' : 'f',
+ old_mask)));
+ }
+ }
+
pg_atomic_write_u32(&proc->pendingRecoveryConflicts, 0);
/* must be cleared with xid/xmin: */
@@ -910,6 +955,11 @@ void
ProcArrayClearTransaction(PGPROC *proc)
{
int pgxactoff;
+ bool log_clear = false;
+ int target_procno = INVALID_PROC_NUMBER;
+ pid_t target_pid = 0;
+ uint32 old_mask = 0;
+ bool self_is_target = false;
/*
* Currently we need to lock ProcArrayLock exclusively here, as we
@@ -934,6 +984,21 @@ ProcArrayClearTransaction(PGPROC *proc)
proc->vxid.lxid = InvalidLocalTransactionId;
proc->xmin = InvalidTransactionId;
+
+ /* Avoid strong read cost on the common zero case. */
+ old_mask = pg_atomic_read_u32(&proc->pendingRecoveryConflicts);
+ if (old_mask != 0)
+ {
+ old_mask = pg_atomic_read_membarrier_u32(&proc->pendingRecoveryConflicts);
+ if (old_mask != 0)
+ {
+ log_clear = true;
+ target_pid = proc->pid;
+ target_procno = GetNumberFromPGProc(proc);
+ self_is_target = (proc == MyProc);
+ }
+ }
+
pg_atomic_write_u32(&proc->pendingRecoveryConflicts, 0);
Assert(!(proc->statusFlags & PROC_VACUUM_STATE_MASK));
@@ -960,6 +1025,14 @@ ProcArrayClearTransaction(PGPROC *proc)
}
LWLockRelease(ProcArrayLock);
+
+ if (log_clear)
+ ereport(LOG,
+ (errmsg("DBG ProcArrayClearTransaction self_pid=%d target_pid=%d "
+ "target_procno=%d self_is_target=%c "
+ "clearing pendingRecoveryConflicts=0x%x",
+ getpid(), (int) target_pid, target_procno,
+ self_is_target ? 't' : 'f', old_mask)));
}
/*
@@ -3458,6 +3531,9 @@ bool
SignalRecoveryConflict(PGPROC *proc, pid_t pid, RecoveryConflictReason reason)
{
bool found = false;
+ int procno = INVALID_PROC_NUMBER;
+ uint32 old_mask = 0;
+ uint32 new_mask = 0;
LWLockAcquire(ProcArrayLock, LW_SHARED);
@@ -3467,15 +3543,24 @@ SignalRecoveryConflict(PGPROC *proc, pid_t pid, RecoveryConflictReason reason)
*/
if (proc->pid == pid)
{
- (void) pg_atomic_fetch_or_u32(&proc->pendingRecoveryConflicts, (1 << reason));
+ procno = GetNumberFromPGProc(proc);
+ old_mask = pg_atomic_fetch_or_u32(&proc->pendingRecoveryConflicts,
+ (1 << reason));
+ new_mask = old_mask | (1 << reason);
/* wake up the process */
- (void) SendProcSignal(pid, PROCSIG_RECOVERY_CONFLICT, GetNumberFromPGProc(proc));
+ (void) SendProcSignal(pid, PROCSIG_RECOVERY_CONFLICT, procno);
found = true;
}
LWLockRelease(ProcArrayLock);
+ if (found)
+ ereport(LOG,
+ (errmsg("DBG SignalRecoveryConflict self_pid=%d target_pid=%d "
+ "target_procno=%d reason=%d old_mask=0x%x new_mask=0x%x",
+ getpid(), (int) pid, procno, (int) reason, old_mask, new_mask)));
+
return found;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d01a09dd0c4..862f9e960c9 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3070,11 +3070,15 @@ FloatExceptionHandler(SIGNAL_ARGS)
* Tell the next CHECK_FOR_INTERRUPTS() to process recovery conflicts. Runs
* in a SIGUSR1 handler.
*/
+static volatile sig_atomic_t dbg_sigusr1_conflict_count = 0;
+
void
HandleRecoveryConflictInterrupt(void)
{
if (pg_atomic_read_u32(&MyProc->pendingRecoveryConflicts) != 0)
InterruptPending = true;
+ if (dbg_sigusr1_conflict_count < INT_MAX)
+ dbg_sigusr1_conflict_count++;
/* latch will be set by procsignal_sigusr1_handler */
}
@@ -3296,6 +3300,10 @@ ProcessRecoveryConflictInterrupts(void)
Assert(!proc_exit_inprogress);
Assert(InterruptHoldoffCount == 0);
+ ereport(LOG,
+ (errmsg("DBG ProcessRecoveryConflictInterrupts[%d] enter",
+ getpid())));
+
/* Are any recovery conflict pending? */
pending = pg_atomic_read_membarrier_u32(&MyProc->pendingRecoveryConflicts);
if (pending == 0)
@@ -3313,6 +3321,11 @@ ProcessRecoveryConflictInterrupts(void)
{
if ((pending & (1 << reason)) != 0)
{
+ ereport(LOG,
+ (errmsg("DBG ProcessRecoveryConflictInterrupts[%d] "
+ "dispatching reason=%d pending=0x%x",
+ getpid(), (int) reason, pending)));
+
/* clear the flag */
(void) pg_atomic_fetch_and_u32(&MyProc->pendingRecoveryConflicts, ~(1 << reason));
@@ -3506,6 +3519,52 @@ ProcessInterrupts(void)
}
}
+ {
+ int handler_count;
+ uint32 weak;
+ uint32 strong;
+
+ handler_count = (int) dbg_sigusr1_conflict_count;
+ weak = pg_atomic_read_u32(&MyProc->pendingRecoveryConflicts);
+
+ /*
+ * Don't pay for a strong read in the common case with no signal and no
+ * pending conflicts.
+ */
+ if (handler_count > 0 || weak != 0)
+ {
+ /*
+ * This handoff is intentionally approximate for diagnostics: a signal
+ * can arrive between reading and clearing this counter.
+ */
+ dbg_sigusr1_conflict_count = 0;
+ strong = pg_atomic_read_membarrier_u32(&MyProc->pendingRecoveryConflicts);
+
+ if (weak != strong)
+ ereport(LOG,
+ (errmsg("DBG ProcessInterrupts[%d] weak/strong mismatch "
+ "weak=0x%x strong=0x%x handler_count=%d",
+ getpid(), weak, strong, handler_count)));
+ else if (handler_count > 0 && weak == 0)
+ ereport(LOG,
+ (errmsg("DBG ProcessInterrupts[%d] handler fired but "
+ "pending=0x0 (spurious wakeup, cleared, or "
+ "counter-race), handler_count=%d",
+ getpid(), handler_count)));
+ else if (handler_count > 0)
+ ereport(LOG,
+ (errmsg("DBG ProcessInterrupts[%d] handler fired "
+ "pending=0x%x handler_count=%d",
+ getpid(), weak, handler_count)));
+ else
+ ereport(LOG,
+ (errmsg("DBG ProcessInterrupts[%d] no handler but "
+ "pending=0x%x (preexisting)",
+ getpid(), weak)));
+ }
+ }
+
+ /* Preserve original dispatch behavior: weak read at point of check. */
if (pg_atomic_read_u32(&MyProc->pendingRecoveryConflicts) != 0)
ProcessRecoveryConflictInterrupts();