Simon Riggs wrote:
> Greg Stark <st...@mit.edu> wrote:
>> Tom Lane <t...@sss.pgh.pa.us> wrote:

>>> Isn't there an even more serious problem, namely that this
>>> assumes *all* transactions are serializable?

Do you mean in terms of the serializable transaction isolation level,
or something else? I haven't read the patches, but I've been trying
to follow the discussion and I don't recall any hint of basing this on
serializable transactions on each source. Of course, when it comes
down to commits, both where a change is committed and where the work
is copied, there must be a commit order; and with asynchronous work
where data isn't partitioned such that there is a single clear owner
for each partition, there will be conflicts which must be resolved. I
don't get the impression that this point has been lost on Simon and
Andres.

>>> What happens when they aren't? Or even just that the effective
>>> commit order is not XID order?
>>
>> Firstly, I haven't read the code but I'm confident it doesn't make
>> the elementary error of assuming commit order == xid order. I
>> assume it's applying the reassembled transactions in commit order.

Same here.

>> I don't think it assumes the transactions are serializable because
>> it's only concerned with writes, not reads. So the transaction
>> it's replaying may or may not have been able to view the data
>> written by other transactions that committed earlier but it doesn't
>> matter when trying to reproduce the effects using constants.

IIRC, the developers of this feature have explicitly said that they
will defer any consideration of trying to extend serializable
transaction isolation behavior to a multi-server basis until after
they have other things working. (Frankly, to do otherwise would not
be sane.) It appears to me that it can't be managed in a general
sense without destroying almost all the advantages of multi-master
replication, at least (as I said before) where data isn't partitioned
such that there is a single clear owner for each partition.
Where such partitioning is present and there are data sets maintained
exclusively by serializable transactions, anomaly-free reads of the
data could be accomplished by committing transactions on the replicas
in "apparent order of execution" rather than "commit order". Apparent
order of execution must take both commit order and read-write
dependencies into consideration.

>> The data written by this transaction is either written or not when
>> the commit happens and it's all written or not at that time. Even
>> in non-serializable mode updates take row locks and nobody can see
>> the data or modify it until the transaction commits.

As with read-only transactions and hot standbys, the problem comes in
when a transaction commits and is replicated while another transaction,
which is basing its updates on the earlier state of the data, remains
uncommitted. It gets even more exciting with MMR since the
transaction working with the old version of the data might be on a
different machine, on another continent. With certain types of
workloads, it seems to me that it could get pretty crazy if certain
closely-related actions are not kept within a single database (like
controlling the active batch and adding items to a batch).

In the "wild, half-baked, hand-wavey suggestions" department -- maybe
there should be some consideration of a long-term way within MMR to
direct activities to certain logical nodes, each of which could be
mapped to a single physical node at any one time. Basically, to route
a request through the MMR network to the current logical node for
handling something, and have the effects ripple back out through all
nodes.

> This uses Commit Serializability, which is valid, as you say.

Well, it is a type of concurrency control with a name and a
definition, if that's what you mean. I agree it provides sufficient
guarantees to create a workable MMR system, if you have adequate
conflict resolution.
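
To make the "apparent order of execution" point above a bit more
concrete, here is a rough sketch in Python. It is not taken from the
patches and does not claim to describe how they work. The function
name apparent_order, the transaction names close_batch and add_item,
and the idea that the caller already has a list of "must appear
before" dependency pairs are all assumptions made for illustration.
It orders transactions so that every dependency is respected, falling
back to commit order wherever the dependencies leave a choice.

```python
import heapq


def apparent_order(commit_order, dependencies):
    """Return an order that respects every (before, after) dependency,
    preferring commit order wherever the dependencies allow a choice.

    commit_order -- transaction names in the order they committed
    dependencies -- hypothetical input: (before, after) pairs meaning
                    "before must appear to have run ahead of after"
    """
    position = {txn: i for i, txn in enumerate(commit_order)}
    successors = {txn: [] for txn in commit_order}
    pending = {txn: 0 for txn in commit_order}

    for before, after in dependencies:
        successors[before].append(after)
        pending[after] += 1

    # Transactions with nothing ahead of them, earliest commit first.
    ready = [(position[t], t) for t in commit_order if pending[t] == 0]
    heapq.heapify(ready)

    result = []
    while ready:
        _, txn = heapq.heappop(ready)
        result.append(txn)
        for nxt in successors[txn]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                heapq.heappush(ready, (position[nxt], nxt))

    if len(result) != len(commit_order):
        # A cycle means no serial order explains this history.
        raise ValueError("dependency cycle: no serial order exists")
    return result


# close_batch committed first, but add_item read the batch control row
# before close_batch changed it, so add_item appears to have run first.
commit_order = ["close_batch", "add_item"]
dependencies = [("add_item", "close_batch")]
print(apparent_order(commit_order, dependencies))
# prints ['add_item', 'close_batch']
```

In the batch example, a replica applying these two in commit order
would show a closed batch that later gains an item; applying them in
the order returned here would not.
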
My concern is that Commit Serializability not be confused with
serializability in the mathematical sense or in the sense of
transaction isolation levels. In general on this thread, when I've
seen the terms "serializable" and "serializability" I haven't been
clear on whether the words are being used in their more general sense
as words in the English language, or in a more particular technical
sense.

-Kevin