Greg Stark wrote:

> So T1 must have happened before TN because it wrote something based
> on data as it was before TN modified it. But T0 can see TN but not
> T1 so there's no complete ordering between the three transactions
> that makes them all make sense.

Correct.

> The thing is that the database state is reasonable, the database
> state is after it would be if the ordering were T1,TN with T0
> happening any time. And the backup state is reasonable, it's as if
> it occurred after TN and before T1. They just don't agree.

I agree that the database state eventually "settles" into a valid
long-term condition in this particular example. The point you are
conceding seems to be that the image captured by pg_dump is not
consistent with that. If so, I agree. You don't see that as a
problem; I do. I'm not sure where we go from there.

Certainly that is better than making pg_dump vulnerable to
serialization failure -- if we don't implement the SERIALIZABLE READ
ONLY DEFERRABLE transactions I was describing, we can change pg_dump
to use REPEATABLE READ and we will be no worse off than we are now.

The new feature I was proposing was that we create a SERIALIZABLE
READ ONLY DEFERRABLE transaction style which would, rather than
acquiring predicate locks and watching for conflicts, potentially
wait until it could acquire a snapshot which was guaranteed to be
conflict-free.

In the example discussed on this thread, if we changed pg_dump to use
such a mode, when it went to acquire a snapshot it would see that it
overlapped T1, which was not READ ONLY, which in turn overlapped TN,
which had written to a table and committed. It would then block
until completion of the T1 transaction and adjust its snapshot to
make that transaction visible. You would now have a backup entirely
consistent with the long-term state of the database, with no risk of
serialization failure and no bloating of the predicate lock
structures.
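
To make the snapshot rule in that example concrete, here is a toy
model in Python. It is only a sketch of the T0/T1/TN case from this
thread, not PostgreSQL code: the Txn class, the integer timestamps,
and the blockers and deferred_snapshot_time functions are all
hypothetical names invented for illustration, and the overlap test is
a simplification of what a real implementation would need to track.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Txn:
    """One transaction in a simulated schedule (hypothetical model)."""
    name: str
    start: int
    commit: Optional[int]  # None = never finishes within the schedule
    read_only: bool = False
    wrote: bool = False


def active_at(txn: Txn, t: int) -> bool:
    # Started at or before t and not yet committed at t.
    return txn.start <= t and (txn.commit is None or txn.commit > t)


def committed_by(txn: Txn, t: int) -> bool:
    return txn.commit is not None and txn.commit <= t


def blockers(txns: List[Txn], t: int) -> List[Txn]:
    """Active read-write transactions that overlapped a writer which
    has already committed by time t (T1 overlapping TN in the thread)."""
    result = []
    for rw in txns:
        if rw.read_only or not active_at(rw, t):
            continue
        for w in txns:
            if w is rw or not w.wrote or not committed_by(w, t):
                continue
            if w.commit > rw.start:  # w committed while rw was running
                result.append(rw)
                break
    return result


def deferred_snapshot_time(txns: List[Txn], requested: int) -> int:
    """Earliest time >= requested at which no blocker remains."""
    t = requested
    while True:
        waiting_on = blockers(txns, t)
        if not waiting_on:
            return t
        if any(b.commit is None for b in waiting_on):
            raise RuntimeError("blocking transaction never finishes")
        # Block until every current blocker has completed, then recheck.
        t = max(b.commit for b in waiting_on)


# TN writes and commits at 3; T1 is read-write and runs from 2 to 6.
schedule = [Txn("TN", 1, 3, wrote=True), Txn("T1", 2, 6)]
print(deferred_snapshot_time(schedule, 4))  # 6: waits for T1 to finish
```

In this schedule a dump asking for a snapshot at time 4 is deferred
to time 6, at which point both TN and T1 are visible to it. If T1 is
marked read_only=True, the same call returns 4 with no wait.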

The only down side is that there could be blocking when such a
transaction acquires its snapshot. That seems a reasonable price to
pay for backup integrity.

Obviously, if we had such a mode, it would be trivial to add a switch
to the pg_dump command line which would let the user choose between
guaranteed dump integrity and guaranteed lack of blocking at the
start of the dump.

-Kevin