On Sat, Aug 9, 2014 at 2:00 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> +1. I think the current behaviour is a seriously bad idea.
> I don't think it's anywhere near as black-and-white as you guys claim.
> What it comes down to is whether allowing existing transactions/sessions
> to finish is more important than allowing new sessions to start.
> Depending on the application, either could be more important.
It's partly about that, and I think the answer is that being able to
start new sessions is almost always more important; but it's also
about the fact that the postmaster provides essential
protections against data corruption, and running without those
protections is a bad idea. If it's not a bad idea, then why do we
need those protections ever? Why have we put so much effort into
bullet-proofing them over the years?
I mean, we could simply regard the unexpected end of a backend as
being something that is "probably OK" and we'd usually be right; after
all, a backend wouldn't crap out without releasing a critical spinlock
very often. A lot of users would probably be very happy to be
liberated from the tyranny of a server-wide restart every time a
backend crashes, and 90% of the time nothing bad would happen. But
clearly this is insanity, because every now and then something would
go terribly wrong and there would be no automated way for the system
to recover, and on even rarer occasions your data would get eaten.
That is why it is right to think that the service provided by the
postmaster is essential, not nice-to-have.
> Ideally we'd have some way to configure the behavior appropriately for
> a given installation; but short of that, it's unclear to me that
> unilaterally changing the system's bias is something our users would
> thank us for. I've not noticed a large groundswell of complaints about
> it (though this may just reflect that we've made the postmaster pretty
> darn robust, so that the case seldom comes up).
I do think that's a large part of it. The postmaster doesn't get
killed very often, and when it does, things are often messed up to a
degree where the user's just going to reboot anyway. But I've
encountered customers who managed to corrupt their database because
backends didn't exit when the postmaster died, because it turns out
that removing postmaster.pid defeats the shared memory interlocks that
normally prevent starting a new postmaster, and the customer did that.
And I've personally experienced at least one protracted outage that
resulted from orphaned backends preventing 'pg_ctl restart' from
working. If the postmaster weren't so reliable, I'm sure these kinds
of problems would be a lot more common.
But the fact that they're uncommon doesn't mean that the current
behavior is the best one, and I'm convinced that it isn't.
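
For illustration only, here is one common Unix way a child process can
notice that its parent has gone away: the parent holds the write end of
a pipe open for its whole lifetime, and the child treats EOF on the read
end as "parent is dead". This is a minimal Python sketch of that general
technique, not PostgreSQL's code; the function names parent_is_alive and
run_until_orphaned are made up for the example.

```python
import os
import select


def parent_is_alive(read_fd, timeout=0.0):
    """Return True while some process still holds the pipe's write end open.

    Assumption: the parent created the pipe before forking and never writes
    to it, and the child closed its own inherited copy of the write end.
    """
    readable, _, _ = select.select([read_fd], [], [], timeout)
    if not readable:
        # Nothing to read and no EOF: the write end is still open.
        return True
    # An empty read means EOF, i.e. every holder of the write end is gone.
    return os.read(read_fd, 1) != b""


def run_until_orphaned(read_fd, step):
    """Call step() repeatedly and stop as soon as the parent is gone.

    Returns the number of steps completed, so the caller can decide how
    to shut down (a real worker would exit here rather than keep running).
    """
    steps = 0
    while parent_is_alive(read_fd):
        step()
        steps += 1
    return steps
```

The point of the sketch is that the check is cheap enough to run between
units of work, so an orphaned worker can stop promptly instead of
lingering and blocking a restart.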
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)