Jan Wieck <[EMAIL PROTECTED]> writes:
Tom Lane wrote:
Maybe there should be a provision similar to the stats collector's check-for-read-ready-from-a-pipe?
The case of the bgwriter is a bit of a twist here. In contrast to the collectors, it is connected to the shared memory. So it can keep resources and, even worse, it could write() after the postmaster died.
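For readers unfamiliar with the pipe trick mentioned above, here is a minimal sketch of the general idea, not the actual stats collector code. It assumes the parent holds the write end of a pipe and never writes to it, so the read end only becomes readable (at EOF) once every holder of the write end has exited. The function name pipe_peer_gone is hypothetical.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: poll the read end of a pipe without blocking.
 * Returns 1 if the write end has been closed (read() reports EOF),
 * 0 if the pipe is still open, and -1 on error.
 * Assumption: nothing is ever written to the pipe; if a byte did
 * arrive, this sketch would consume it and report 0.
 */
int pipe_peer_gone(int read_fd)
{
    fd_set rfds;
    struct timeval tv = {0, 0};   /* zero timeout: just poll */
    char c;
    int rc;
    ssize_t n;

    assert(read_fd >= 0);

    FD_ZERO(&rfds);
    FD_SET(read_fd, &rfds);

    rc = select(read_fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;                 /* not readable: peer still alive */

    /* Readable means either data or EOF; only EOF signals a dead peer. */
    n = read(read_fd, &c, 1);
    if (n == 0)
        return 1;
    return (n < 0) ? -1 : 0;
}
```

A child would call this between units of work and start its own shutdown once it returns 1.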
That's not "worse", really. Any backends that are still alive are committing real live transactions --- they're telling their clients they committed, so we'd better commit. I don't mind if performance gets worse or if we lose pg_stats statistics, but we'd better not adopt the attitude that transaction correctness no longer matters after a postmaster crash.
So one thing we ought to think about here is whether system correctness depends on the bgwriter continuing to run until the last backend is gone. AFAICS that is not true now --- the bgwriter just improves performance --- but we'd better decide what our plan for the future is.
Maybe there is a chance to create a watchdog for free here. Do we currently create our own process group, with all processes under the postmaster belonging to it?
We do not; I'm not sure the notion of a process group is even portable, and I am pretty sure that the API to control process grouping isn't.
If the bgwriter, each time it naps, checked whether its parent process is init (Win32 note: check whether the postmaster no longer exists instead), it could kill the entire process group on behalf of the dead postmaster.
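A rough sketch of the Unix-side check described above. The literal form of the proposal would be getppid() == 1; the variant below instead compares against a parent PID remembered at startup, on the assumption that an orphan is not always reparented to PID 1 on every system. The function name is hypothetical and this is not PostgreSQL code.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if this process has been reparented
 * since original_parent was recorded, 0 otherwise.
 * original_parent is assumed to have been saved with getppid() right
 * after the fork, while the parent was known to be alive.
 */
int parent_has_changed(pid_t original_parent)
{
    assert(original_parent > 0);
    return getppid() != original_parent;
}
```

The caller would save getppid() once at startup and pass it in on each nap cycle.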
I don't think we want that. IMHO the preferred behavior if the postmaster crashes should be like a "smart shutdown" --- you don't spawn any more backends (obviously) but existing backends should be allowed to run until their clients exit. That's how things have always worked anyway...
[ thinks ... ] If we do want it we don't need any process-group assumptions. The bgwriter is connected to shmem so it can scan the PGPROC array and issue kill() against each sibling.
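To make the scan-and-kill idea concrete, here is a sketch under loud assumptions: ProcSlot is a hypothetical stand-in for a process table in shared memory (the real PGPROC structure has many more fields and needs locking, which is omitted), and signal_siblings is an invented name. Only the kill() call itself is a real API.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for one entry of a shared process table. */
typedef struct ProcSlot
{
    pid_t pid;                    /* 0 means the slot is unused */
} ProcSlot;

/*
 * Send sig to every process listed in slots except ourselves.
 * Returns the number of processes for which kill() succeeded.
 * Slots with pid <= 0 are skipped: kill() treats such values as
 * process-group targets, which is not what is wanted here.
 */
int signal_siblings(const ProcSlot *slots, int nslots, pid_t self, int sig)
{
    int delivered = 0;
    int i;

    assert(slots != NULL);

    for (i = 0; i < nslots; i++)
    {
        pid_t pid = slots[i].pid;

        if (pid <= 0 || pid == self)
            continue;
        if (kill(pid, sig) == 0)
            delivered++;
    }
    return delivered;
}
```

Passing 0 as sig only probes whether each process exists without delivering anything, which is handy for a dry run; which real signal to send is the policy question under discussion.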
Right. Which can change the backend behaviour from a smart shutdown to an immediate shutdown. In the case of a postmaster crash, I think something in the system is so wrong that I'd prefer an immediate shutdown.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #