Re: [HACKERS] bgwriter never dies

Jan Wieck Tue, 24 Feb 2004 04:35:46 -0800

Tom Lane wrote:

Jan Wieck <[EMAIL PROTECTED]> writes:
Tom Lane wrote:
Maybe there should be a provision similar to the stats collector's
check-for-read-ready-from-a-pipe?
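[Editor's sketch, not part of the original mail.] The pipe check mentioned above can be illustrated roughly as follows. It assumes the parent holds the write end of a pipe and never writes to it, so the read end only becomes ready once every writer is gone. The function name pipe_writer_gone is hypothetical and is not taken from the PostgreSQL sources.

```c
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Returns true if the read end of a pipe reports readiness.
 * Assumption: nobody ever writes data into the pipe, so "ready"
 * can only mean end-of-file, i.e. all holders of the write end
 * (here: the parent process) have exited or closed it.
 */
static bool
pipe_writer_gone(int read_fd)
{
    struct pollfd pfd = { .fd = read_fd, .events = POLLIN };

    /* Timeout 0: just probe, never block the caller. */
    if (poll(&pfd, 1, 0) <= 0)
        return false;           /* not ready (or poll failed): assume alive */

    /* Depending on the platform, EOF shows up as POLLHUP, POLLIN, or both. */
    return (pfd.revents & (POLLIN | POLLHUP)) != 0;
}
```

A child process could call this each time it wakes up and exit once it returns true.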
The case of the bgwriter is a bit of a twist here. In contrast to the collectors, it is connected to shared memory. So it can hold on to resources and, even worse, it could write() after the postmaster has died.
That's not "worse", really.  Any backends that are still alive are
committing real live transactions --- they're telling their clients
they committed, so we'd better commit.  I don't mind if performance gets
worse or if we lose pg_stats statistics, but we'd better not adopt the
attitude that transaction correctness no longer matters after a
postmaster crash.
So one thing we ought to think about here is whether system correctness
depends on the bgwriter continuing to run until the last backend is
gone.  AFAICS that is not true now --- the bgwriter just improves
performance --- but we'd better decide what our plan for the future is.
Maybe there is a chance to create a watchdog for free here. Do we currently create our own process group, with all processes under the postmaster belonging to it?
We do not; I'm not sure the notion of a process group is even portable,
and I am pretty sure that the API to control process grouping isn't.
If the bgwriter checked, each time it naps, whether its parent process is init (Win32 note: check whether the postmaster no longer exists instead), it could kill the entire process group on behalf of the dead postmaster.
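[Editor's sketch, not part of the original mail.] On Unix-like systems the parent check could look roughly like this. The mail proposes testing whether the parent is init (getppid() == 1). This sketch instead compares against a remembered parent PID, a swapped-in variant that also covers systems where orphans are re-parented to some process other than PID 1. The name parent_is_gone is hypothetical.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns true if the process that started us is no longer our parent.
 * original_parent is assumed to have been recorded with getppid() at
 * startup, before the parent had any chance to die.
 */
static bool
parent_is_gone(pid_t original_parent)
{
    /* After the parent exits, the kernel re-parents us (classically to
     * init, PID 1), so getppid() stops matching the remembered value. */
    return getppid() != original_parent;
}
```

A worker would record getppid() once at startup and call parent_is_gone() after each nap.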
I don't think we want that.  IMHO the preferred behavior if the
postmaster crashes should be like a "smart shutdown" --- you don't spawn
any more backends (obviously) but existing backends should be allowed to
run until their clients exit.  That's how things have always worked
anyway...
[ thinks ... ]  If we do want it we don't need any process-group
assumptions.  The bgwriter is connected to shmem so it can scan the
PGPROC array and issue kill() against each sibling.

Right. Which can change the backend behaviour from a smart shutdown to an immediate shutdown. In the case of a postmaster crash, I think something in the system is so wrong that I'd prefer an immediate shutdown.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #


