On Tue, Nov 8, 2016 at 5:56 PM, Thomas Munro <thomas.mu...@enterprisedb.com> wrote:
> Here is an experimental WIP patch to allow SERIALIZABLE READ ONLY
> DEFERRABLE transactions on standby servers without serialisation
> anomalies, based loosely on an old email from Kevin Grittner[1].  I'm
> not sure how far this is from what he had in mind or whether I've
> misunderstood something fundamental here, but I hope this can at least
> serve as a starting point and we can try to get something into
> Postgres 10.
While out walking I realised what was wrong with that. It's going to take me a while to find the time to get back to this, so I figured I should share this realisation in case anyone else is interested in the topic.

The problem is that the patch determines snapshot safety in PreCommit_CheckForSerializationFailure, and then races other backends to XactLogCommitRecord. It could determine that a hypothetical snapshot taken after this commit is safe, but then other activity resulting in a hypothetical snapshot of unknown safety could happen and be logged before we record our determination in the log.

One solution could be to serialise XactLogCommitRecord for SSI transactions using SerializableXactHashLock, and determine hypothetical snapshot safety at the same time, so that commit replay order matches safety determination order. But it would suck to add another point of lock contention to SSI commits.

Another solution could be to have recovery on the standby detect tokens (CSNs incremented by PreCommit_CheckForSerializationFailure) arriving out of order, but I don't know exactly what it should do when it detects that: you shouldn't respect an out-of-order claim of safety, but then what should you wait for? Perhaps if the last replayed commit record before that was marked SNAPSHOT_SAFE then it's OK to leave it that way, and if it was marked SNAPSHOT_SAFETY_UNKNOWN then you have to wait for that one to be resolved by a follow-up snapshot safety message and then rinse and repeat (take a new snapshot etc). I think that might work, but it seems strange to allow random races on the primary to create extra delays on the standby. Perhaps there is some much simpler way to do all this that I'm missing.

Another detail is that standbys that start up from a checkpoint and don't see any SSI transactions commit don't yet have any snapshot safety information. Defaulting to assuming that this point is safe doesn't seem right, so I suspect the safety information needs to be in checkpoints.
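To make the out-of-order idea a bit more concrete, here is a toy model in Python (not PostgreSQL code). Everything in it is hypothetical: the `CommitRecord` type, the `token` field standing in for the counter bumped at safety-determination time, and the `replay` function are invented for illustration. It only models the rule "don't respect an out-of-order claim of safety", it does not model the follow-up snapshot safety messages, and it makes two assumptions of its own: the standby starts out with unknown safety, and an out-of-order record marked unknown is adopted conservatively.

```python
from dataclasses import dataclass
from enum import Enum


class Safety(Enum):
    SAFE = "SNAPSHOT_SAFE"
    UNKNOWN = "SNAPSHOT_SAFETY_UNKNOWN"


@dataclass(frozen=True)
class CommitRecord:
    token: int      # hypothetical counter bumped when safety was determined
    safety: Safety  # primary's claim about a snapshot taken after this commit


def replay(records):
    """Return the standby's safety state after each record, in log order."""
    highest_token = -1
    state = Safety.UNKNOWN  # assumption: nothing is known before the first record
    history = []
    for rec in records:
        out_of_order = rec.token < highest_token
        if out_of_order and rec.safety is Safety.SAFE:
            # Stale claim of safety: ignore it and keep the previous state.
            # If that state is UNKNOWN, the standby has to keep waiting.
            pass
        else:
            # In-order record, or an out-of-order UNKNOWN (adopted
            # conservatively; this case is an assumption of the sketch).
            state = rec.safety
        highest_token = max(highest_token, rec.token)
        history.append(state)
    return history


# The race: token 1 was judged safe, but token 2 (unknown) was logged first.
raced = [CommitRecord(2, Safety.UNKNOWN), CommitRecord(1, Safety.SAFE)]
print([s.value for s in replay(raced)])
```

In the raced case the late SAFE claim is dropped and the state stays unknown after both records, which is exactly the extra delay on the standby mentioned above.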
Attached is a tidied-up version which doesn't try to address the above problems yet. When time permits I'll come back to this.

-- 
Thomas Munro
http://www.enterprisedb.com
ssi-standby-v2.patch
Description: Binary data