Hi,

On Tue, Mar 10, 2026 at 1:05 AM Xuneng Zhou <[email protected]> wrote:
>
> On Mon, Mar 9, 2026 at 11:28 PM Heikki Linnakangas <[email protected]> wrote:
> >
> > On 09/03/2026 17:02, Xuneng Zhou wrote:
> > > Did you use Alexander’s reproducer script? I tried reproducing with a
> > > 1 ms pg_usleep() added to all three functions that clear
> > > MyProc->pendingRecoveryConflicts, but I still couldn’t reproduce the
> > > issue.
> >
> > I used the attached, to be precise. With that it fails every time for
> > me. I'm not sure if the "if (am_walsender)" check is necessary, I added
> > it just to make the test run faster.
> >
> > - Heikki
>
> I was able to reproduce the issue using a wider sleep window as you
> suggested and can confirm that the flag is not cleared after applying
> the patch. Below are two logs—one from a successful run and one from a
> failed run. I'll look further into the patch later on.
>
> failed run:
> startup[1418915] LOG: DBG SignalRecoveryConflict target_pid=1419118
> reason=4 old_mask=0x0 new_mask=0x10
> walsender[1419118] LOG: DBG ProcArrayEndTransaction(no-xid) CLEARING
> pendingRecoveryConflicts=0x10
>
> successful run:
> startup[1433218] LOG: DBG SignalRecoveryConflict target_pid=1433406
> reason=4 old_mask=0x0 new_mask=0x10
> walsender[1433406] LOG: DBG ProcessInterrupts handler fired 1
> time(s), pending=0x10 -- processing
> walsender[1433406] ERROR: canceling statement due to conflict with recovery
>
> --
> Best,
> Xuneng

I ran the script several times after applying the patch, and all tests
passed without deadlocking. LGTM.

One nit: should we split the comment fix and the InitAuxiliaryProcess
hardening into separate patches?

--
Best,
Xuneng
