Hi hackers Here is an experimental WIP patch to allow SERIALIZABLE READ ONLY DEFERRABLE transactions on standby servers without serialisation anomalies, based loosely on an old email from Kevin Grittner[1]. I'm not sure how far this is from what he had in mind or whether I've misunderstood something fundamental here, but I hope this can at least serve as a starting point and we can try to get something into Postgres 10.
The patch works by teaching the standby how to do somethings similar to what SERIALIZABLE READ ONLY DEFERRABLE does on the primary server, with some help from the primary server in the form of extra information in the WAL. The basic idea is: the standby should wait until a point in the transaction stream where it can take a snapshot and either (1) there were no concurrent read/write SERIALIZABLE transactions running on the primary, or (2) the last concurrent read/write SERIALIZABLE transaction at snapshot time has now ended without creating a dangerous cycle with our transaction. In case (1), the primary sometimes adds an extra xl_xact_snapshot_safety struct to commit messages which says 'a snapshot taken after this commit and before any future SSI commits is safe, because there are no concurrent read/write SSI transactions at this moment'. In case (2), the xl_xact_snapshot_safety struct embedded in a commit record instead says 'a snapshot taken after this commit and before any future SSI commits is of unknown safety, because there are concurrent transactions; I'll tell you when it has been determined; please remember this token'. The token (which happens to be a CSN but that is not important) will appear in a future independent snapshot safety message which says whether a snapshot taken at that time is safe or unsafe. Note that xl_xact_snapshot_safety is embedded in the commit messages (for SSI transactions only), but also appears as its own WAL record to report the final state of a token from an earlier commit. So if you do a lot of non-overlapping writable SSI transactions, you'll get just a few extra bytes in each commit record, but overlapping transactions will generate a stream of extra snapshot safety messages, one for each commit involved. In order to generate follow-up snapshot safety messages, the patch creates 'hypothetical' transactions on the primary whenever a writeable SSI transaction commits, so that it can figure out whether such a transaction would conflict. These phantom transactions are proxies for any transaction that may be created on a standby at the same point in the transaction stream (with respect to SSI commits) on any standby, and survive in memory just until they are found to be safe or unsafe. Example of use: T1 on primary: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE; T1 on primary: INSERT INTO foo VALUES ('x'); T2 on primary: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE; T2 on primary: INSERT INTO foo VALUES ('x'); T2 on primary: COMMIT; T3 on standby: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE; T3 on standby: SELECT * FROM foo; T3 on standby: <...waits...> T1 on primary: COMMIT; T3 on standby: <...continues...> Not tested much and certainly has bugs and many details to sort out, but first... is this sound or could it be made so? Is there a better way? [1] https://www.postgresql.org/message-id/4D3735E30200002500039869%40gw.wicourts.gov -- Thomas Munro http://www.enterprisedb.com
ssi-standby-v1.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers