On Thu, Jan 19, 2017 at 8:11 PM, Ants Aasma <ants.aa...@eesti.ee> wrote:
> On Tue, Jan 3, 2017 at 3:43 AM, Thomas Munro
> <thomas.mu...@enterprisedb.com> wrote:
>> Long term, I think it would be pretty cool if we could develop a set
>> of features that give you distributed sequential consistency on top of
>> streaming replication. Something like (this | causality-tokens) +
>> SERIALIZABLE-DEFERRABLE-on-standbys +
>> distributed-dirty-read-prevention.
>
> Is it necessary that causal writes wait for replication before making
> the transaction visible on the master? I'm asking because the per tx
> variable wait time between logging commit record and making
> transaction visible makes it really hard to provide matching
> visibility order on master and standby.

Yeah, that does seem problematic. Even with async replication or no
replication, isn't there already a race in CommitTransaction() where
two backends could reach RecordTransactionCommit() in one order but
ProcArrayEndTransaction() in the other order? AFAICS using
synchronous replication in one of the transactions just makes it more
likely you'll experience such a visibility difference between the DO
and REDO histories (!), by making RecordTransactionCommit() wait.
Nothing prevents you getting a snapshot that can see t2 but not t1 in
the DO history, while someone doing PITR or querying an asynchronous
standby gets a snapshot that can see t1 but not t2 because those
replay the REDO history.

> In CSN based snapshot
> discussions we came to the conclusion that to make standby visibility
> order match master while still allowing for async transactions to
> become visible before they are durable we need to make the commit
> sequence a vector clock and transmit extra visibility ordering
> information to standby's. Having one more level of delay between wal
> logging of commit and making it visible would make the problem even
> worse.

I'd like to read that... could you please point me at the right bit of
that discussion?

> One other thing that might be an issue for some users is that this
> patch only ensures that clients observe forwards progress of database
> state after a writing transaction. With two consecutive read only
> transactions that go to different servers a client could still observe
> database state going backwards.

True. This patch is about "read your writes", not, erm, "read your
reads". That may indeed be problematic for some users. It's not a
very satisfying answer but I guess you could run a dummy write query
on the primary every time you switch between standbys, or before
telling any other client to run read-only queries after you have done
so, in order to convert your "r r" sequence into a "r w r" sequence...

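To make the "r r" versus "r w r" point concrete, here is a toy sketch
in Python. It is not how the patch is implemented: LSNs are plain
integers, the Standby and Primary classes and the causal_write()
method are hypothetical names invented for illustration, and
causal_write() simply assumes that a write does not return until every
participating standby has applied it.

```python
from dataclasses import dataclass, field


@dataclass
class Standby:
    name: str
    replayed_lsn: int = 0  # how far this standby has replayed the WAL

    def read(self) -> int:
        # A read-only transaction sees the state as of the replayed LSN.
        return self.replayed_lsn


@dataclass
class Primary:
    lsn: int = 0
    standbys: list = field(default_factory=list)

    def async_write(self) -> int:
        # Some other client commits; standbys catch up whenever they can.
        self.lsn += 1
        return self.lsn

    def causal_write(self) -> int:
        # ASSUMPTION: stands in for a write that does not return until
        # the participating standbys have applied it.
        self.lsn += 1
        for s in self.standbys:
            s.replayed_lsn = max(s.replayed_lsn, self.lsn)
        return self.lsn


def demo():
    a, b = Standby("a"), Standby("b")
    primary = Primary(standbys=[a, b])
    for _ in range(3):
        primary.async_write()
    a.replayed_lsn = 3  # standby a is caught up
    b.replayed_lsn = 1  # standby b is lagging

    # "r r": read on a, then switch to b; state appears to go backwards.
    r_r = (a.read(), b.read())

    # "r w r": a dummy write on the primary between the two reads.
    first = a.read()
    primary.causal_write()
    r_w_r = (first, b.read())
    return r_r, r_w_r
```

In this model demo() returns ((3, 1), (3, 4)): the plain "r r"
sequence goes from LSN 3 back to LSN 1, while the dummy write in
between forces the second read to be at least as new as the first.
The model ignores the case where a lagging standby is excluded rather
than waited for.
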
> It seems that fixing that would
> require either keeping some per client state or a global agreement on
> what snapshots are safe to provide, both of which you tried to avoid
> for this feature.

Agreed. You briefly mentioned this problem in the context of pairs of
read-only transactions a while ago. As you said then, it does seem
plausible to do that with a token system that gives clients the last
commit LSN from the snapshot used by a read only query, so that you
can ask another standby to make sure that LSN has been replayed before
running another read-only transaction. This could be handled
explicitly by a motivated client that is talking to multiple nodes. A
more general problem is client A telling client B to go and run
queries and expecting B to see all transactions that A has seen; it
now has to pass the LSN along with that communication, or rely on some
kind of magic proxy that sees all transactions, or a radically
different system with a GTM.

https://www.postgresql.org/message-id/ca%2bcsw_u4vy5fsbjvc7qms6puzl7qv90%2bonbetk9pfqosnj0...@mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers