On Sat, Aug 9, 2014 at 2:00 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> +1. I think the current behaviour is a seriously bad idea.
>
> I don't think it's anywhere near as black-and-white as you guys claim.
> What it comes down to is whether allowing existing transactions/sessions
> to finish is more important than allowing new sessions to start.
> Depending on the application, either could be more important.

It's partly about that, and I think the answer is that being able to start new sessions is almost always more important; but it's also about the fact that the postmaster provides essential protections against data corruption, and running without those protections is a bad idea. If it's not a bad idea, then why do we need those protections ever? Why have we put so much effort into bullet-proofing them over the years?

I mean, we could simply regard the unexpected end of a backend as being something that is "probably OK" and we'd usually be right; after all, a backend wouldn't crap out without releasing a critical spinlock very often. A lot of users would probably be very happy to be liberated from the tyranny of a server-wide restart every time a backend crashes, and 90% of the time nothing bad would happen. But clearly this is insanity, because every now and then something would go terribly wrong and there would be no automated way for the system to recover, and on even rarer occasions your data would get eaten. That is why it is right to think that the service provided by the postmaster is essential, not nice-to-have.

> Ideally we'd have some way to configure the behavior appropriately for
> a given installation; but short of that, it's unclear to me that
> unilaterally changing the system's bias is something our users would
> thank us for. I've not noticed a large groundswell of complaints about
> it (though this may just reflect that we've made the postmaster pretty
> darn robust, so that the case seldom comes up).

I do think that's a large part of it. The postmaster doesn't get killed very often, and when it does, things are often messed up to a degree where the user's just going to reboot anyway.

But I've encountered customers who managed to corrupt their database because backends didn't exit when the postmaster died: it turns out that removing postmaster.pid defeats the shared memory interlocks that normally prevent starting a new postmaster, and the customer did that. And I've personally experienced at least one protracted outage that resulted from orphaned backends preventing 'pg_ctl restart' from working.

If the postmaster weren't so reliable, I'm sure these kinds of problems would be a lot more common. But the fact that they're uncommon doesn't mean that the current behavior is the best one, and I'm convinced that it isn't.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company