On Fri, Mar 18, 2011 at 12:19 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On Fri, 2011-03-18 at 17:47 +0200, Heikki Linnakangas wrote:
>> On 18.03.2011 16:52, Kevin Grittner wrote:
>> > Simon Riggs<si...@2ndquadrant.com> wrote:
>> >
>> >> In PostgreSQL other users cannot observe the commit until an
>> >> acknowledgement has been received.
>> >
>> > Really? I hadn't picked up on that. That makes for a lot of
>> > complication on crash-and-recovery of a master, but if we can pull
>> > it off, that's really cool. If we do that and MySQL doesn't, we
>> > definitely don't want to use the same terminology they do, which
>> > would imply the same behavior.
>>
>> To be clear: other users cannot observe the commit until standby
>> acknowledges it - unless the master crashes while waiting for the
>> acknowledgment. If that happens, the commit will be visible to everyone
>> after recovery.
>
> No, only in the case where you choose not to failover to the standby
> when you crash, which would be a fairly strange choice after the effort
> to set up the standby. In a correctly configured and operated cluster
> what I say above is fully correct and needs no addendum.

Except it doesn't work that way. If, say, a backend on the master core
dumps, the system will perform a crash and restart cycle, and the
transaction will become visible whether or not it has been replicated
yet. Since we now have a GUC to suppress restart after a backend crash,
it's theoretically possible to set up the system so that this doesn't
occur, but it'd take quite a bit of work to make it robust and
automatic, and it's certainly not the default out of the box.

The fundamental problem here is that once you update CLOG and flush the
corresponding WAL record, there is no going backward. You can hold the
system in some intermediate state where the transaction still holds
locks and is excluded from MVCC snapshots, but there's no way to back
up. So there are bound to be corner cases where the wait doesn't last
as long as you want, and stuff leaks out around the edges. It's
fundamentally impossible to guarantee that you'll remain in that
intermediate state forever - what do you do if a meteor hits the
synchronous standby and at the same time you lose power to the master?
No amount of configuration will save you from coming back on line with
a visible-but-unreplicated transaction.

I'm not knocking the system; I think what we have is impressively good.
But pretending that corner cases can't happen gets us nowhere.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
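
For anyone who finds the argument easier to follow as code, here is a
toy sketch in Python of the sequence described above. It is only a
model, not how the backend is implemented: the class ToyMaster and all
of its method names are hypothetical, a set stands in for the flushed
WAL record plus CLOG update, and a second in-memory set stands in for
the "committed locally but still waiting for the standby" state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMaster:
    """Toy model (assumption, not PostgreSQL code) of a master with one sync standby."""

    # Stands in for the flushed WAL commit record plus the CLOG update.
    durable_commits: set = field(default_factory=set)
    # Transactions committed locally but still hidden while awaiting the ack.
    # This state is assumed to live only in memory.
    waiting_for_ack: set = field(default_factory=set)

    def commit(self, xid):
        # The local commit becomes durable first; there is no undo for this step.
        self.durable_commits.add(xid)
        # The transaction is then held out of view until the standby acknowledges.
        self.waiting_for_ack.add(xid)

    def standby_ack(self, xid):
        # The standby has the commit, so the wait ends and it becomes visible.
        self.waiting_for_ack.discard(xid)

    def visible(self):
        # What other sessions can observe.
        return self.durable_commits - self.waiting_for_ack

    def crash_and_restart(self):
        # Recovery rebuilds visibility from durable state alone, so the
        # in-memory wait state is simply gone afterwards.
        self.waiting_for_ack.clear()


if __name__ == "__main__":
    master = ToyMaster()
    master.commit(100)
    print(master.visible())   # hidden while waiting for the standby
    master.crash_and_restart()
    print(master.visible())   # visible now, although never acknowledged
```

In this model the only way to keep transaction 100 hidden after the
crash is to never bring the master back up at all, which is the
failover case Simon describes.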