On 08/01/2025 04:10, Andres Freund wrote:
I instead opted to somewhat rearrange the shutdown sequence:
This commit changes the shutdown sequence so checkpointer is signalled to
trigger writing the shutdown checkpoint without terminating it. Once
checkpointer wrote the checkpoint it will wait for a termination
signal.
Postmaster now triggers the shutdown checkpoint where we previously did so
by
terminating checkpointer. Checkpointer is terminated after all other
children
have been terminated, tracked using the new PM_SHUTDOWN_CHECKPOINTER
PMState.
In addition, the existing PM_SHUTDOWN_2 state is renamed to
PM_XLOG_IS_SHUTDOWN, that seems a lot more descriptive.
Sounds good.
This patch is not fully polished, there are a few details I'm not sure about:
- Previously the shutdown checkpoint and process exit were triggered from
HandleCheckpointerInterrupts(). I could have continued doing so in the
patch, but it seemed quite weird to have a wait loop in a function called
HandleCheckpointerInterrupts().
Instead I made the main loop in CheckpointerMain() terminate if checkpointer
was asked to write the shutdown checkpoint or exit
- In process_pm_pmsignal() the code now triggers a PostmasterStateMachine() if
checkpointer signalled ShutdownXLOG having completed. We didn't really have
a need for that before. process_pm_child_exit() does call
PostmasterStateMachine(), but somehow calling it in process_pm_pmsignal()
feels slightly off
Looks good to me.
- I don't love the naming of the various PMState values (existing and new),
but a larger renaming should probably be done separately?
We currently have PM_WAIT_BACKENDS and PM_WAIT_DEAD_END. Maybe follow
that example and name all the shutdown states after what you're waiting for:
PM_WAIT_BACKENDS, /* waiting for live backends to exit */
PM_WAIT_XLOG_SHUTDOWN, /* waiting for checkpointer to do shutdown
ckpt */
PM_WAIT_XLOG_ARCHIVAL, /* waiting for archiver and walsenders to
finish */
PM_WAIT_DEAD_END, /* waiting for dead-end children to exit */
PM_WAIT_CHECKPOINTER, /* waiting for checkpointer to shut down */
PM_NO_CHILDREN, /* all important children have exited */
- There's logging in ShutdownXLOG() that's only triggered if called from
checkpointer. Seems kinda odd - shouldn't we just move that to
checkpointer.c?
What logging do you mean? The "LOG: shutting down" message? No
objections to moving it, although it doesn't bother me either way.
The first two patches are just logging improvements that I found helpful to
write this patch:
0001: Instead of modifying pmState directly, do so via a new function
UpdatePMState(), which logs the state transition at DEBUG2.
Requires a helper to stringify PMState.
I've written one-off versions of this patch quite a few times. Without
knowing in what state postmaster is in, it's quite hard to debug the
state machine.
+1
elog(DEBUG1, "updating PMState from %d/%s to %d/%s",
pmState, pmstate_name(pmState), newState,
pmstate_name(newState));
I think the state names are enough, no need to print the numerical values.
0002: Make logging of postmaster signalling child processes more consistent
I found it somewhat hard to understand what's happening during state
changes without being able to see the signals being sent. While we did
have logging in SignalChildren(), we didn't in signal_child(), and most
signals that are important for the shutdown sequence are sent via
signal_child().
+1
I don't think the cases where DEBUG2 or DEBUG4 are used currently are
very well thought out. To save a few lines of code, I'd be happy with
merging signal_child_ext() and signal_child() and just always use DEBUG2
or DEBUG4. Or DEBUG3 as a compromise :-).
The next is a small prerequisite:
0003: postmaster: Introduce variadic btmask_all_except()
I proposed this in another thread, where I also needed
btmask_all_except3() but then removed the only call to
btmask_all_except2(). Instead of adding/removing code, it seems better
to just use a variadic function.
Nice. A variadic btmask_add() might be handy too.
And then 0004, the reason for this thread.
Overall, looks good to me.
--
Heikki Linnakangas
Neon (https://neon.tech)