On Tue, Mar 31, 2026 at 9:03 PM Fujii Masao <[email protected]> wrote:
>
> On Tue, Mar 31, 2026 at 7:42 PM shveta malik <[email protected]> wrote:
> > > > One idea would be to prevent the restart altogether. For example,
> > > > ProcessSlotSyncMessage() could set SlotSyncCtx->last_start_time to
> > > > a special value (like -1), and SlotSyncWorkerCanRestart() could return
> > > > false (i.e., prevent postmater from starting up slotsync worker) when
> > > > it sees that. Alternatively, SlotSyncWorkerCanRestart() could simply
> > > > check SlotSyncCtx->stopSignaled.
> > > >
> > > > That said, as far as I remember correctly, postmaster is generally not
> > > > supposed to touch shared memory (per the comments in postmaster.c),
> > > > so I'm not sure this approach is acceptable. On the other hand,
> > > > postmaster and the slotsync worker already rely on
> > > > SlotSyncCtx->last_start_time,
> > > > so perhaps there's some precedent here.
> > > >
> > > IIUC, checking SlotSyncCtx->stopSignaled in SlotSyncWorkerCanRestart()
> > > may not be ideal, as it requires a spinlock to avoid races with the
> > > startup process and it is disallowed to take lock in postmaster main
> > > loop. Whereas, SlotSyncCtx->last_start_time doesn’t need a lock since
> > > the postmaster accesses it only when the worker is not alive.
> > >
> >
> > I agree.
>
> Could you clarify what issue might arise from checking
> SlotSyncCtx->stopSignaled without holding a spinlock in
> SlotSyncWorkerCanRestart()? Is it actually problematic?
>

We might not see issues in practice, since stopSignaled changes only
once (false -> true), so a corrupted value is unlikely. However,
without a lock or memory barrier, the postmaster is not guaranteed to
read the latest value; on weakly ordered systems (such as ARM64) it
may still see a stale one. In that case the worker could be restarted
once more, and the same unwanted log message could still appear.

> That said, since the postmaster should generally avoid
> touching shared memory, it doesn't seem like a good idea
> for it to check SlotSyncCtx->stopSignaled. So I'm fine with
> instead lowering the log level for the "worker will not start"
> message to DEBUG1.
>

Okay, thanks. I'll share the updated patch soon.

--
Thanks,
Nisha
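
To illustrate the visibility concern with an unlocked read of
stopSignaled, here is a minimal standalone sketch. It uses plain C11
atomics rather than PostgreSQL's own spinlock or barrier primitives,
and the struct and function names (DemoSlotSyncCtx, demo_signal_stop,
demo_can_restart) are hypothetical stand-ins, not code from the tree
or from the patch. It only shows what an explicitly ordered
store/load pair for a one-way flag could look like.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the shared control struct; this is NOT
 * PostgreSQL's real slot sync context.
 */
typedef struct DemoSlotSyncCtx
{
	atomic_bool stop_signaled;	/* flips exactly once: false -> true */
} DemoSlotSyncCtx;

/* Writer side (think: the process that asks the worker to stop). */
static void
demo_signal_stop(DemoSlotSyncCtx *ctx)
{
	/* Release store: earlier writes are published before the flag. */
	atomic_store_explicit(&ctx->stop_signaled, true,
						  memory_order_release);
}

/* Reader side (think: a "may the worker be restarted?" check). */
static bool
demo_can_restart(DemoSlotSyncCtx *ctx)
{
	/* Acquire load: pairs with the release store above. */
	return !atomic_load_explicit(&ctx->stop_signaled,
								 memory_order_acquire);
}
```

Even with this pairing, a reader that runs before the store still
sees false, so this only makes the access well-defined and ordered.
It does not address the separate point that the postmaster should
avoid touching shared memory.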
