> I think we would need to fire invalidations at COMMIT PREPARED, yet > logically decode them at PREPARE. > Yes, we need invalidations to logically decode at PREPARE and then we need invalidations to be executed at COMMIT PREPARED time as well.
DecodeCommit() needs to know when it's processing a COMMIT PREPARED whether this transaction was decoded at PREPARE time.The main issue is that we cannot expect the ReorderBufferTXN structure which was created at PREPARE time to be around till the COMMIT PREPARED gets called. The patch earlier was not cleaning this structure at PREPARE and was adding an is_prepared flag to it so that COMMIT PREPARED knew that it was decoded at PREPARE time. This structure can very well be not around when you restart between PREPARE and COMMIT PREPARED, for example. So now, it's the onus of the prepare filter callback to always give us the answer if a given transaction was decoded at PREPARE time or not. We now hand over the ReorderBufferTxn structure (it can be NULL), xid and gid and the prepare filter tells us what to do. Always. The is_prepared flag can be cached in the txn structure to aid in re-lookups, but if it's not set, the filter could do xid lookup, gid inspection and other shenanigans to give us the same answer every invocation around. Because of the above, we can very well cleanup the ReorderBufferTxn at PREPARE time and it need not hang around till COMMIT PREPARED gets called, which is a good win in terms of resource management. My test cases pass (including the scenario described earlier) with the above code changes in place. I have also added crash testing related TAP test cases, they uncovered a bug in the prepare redo restart code path which I fixed. I believe this patch is in very stable state now. Multiple runs of the crash TAP test pass without issues. Multiple runs of "make check-world" with cassert enabled also pass without issues. Note that this patch does not contain the HeapTupleSatisfiesVacuum changes. I believe we need changes to HeapTupleSatisfiesVacuum given than logical decoding changes the assumption that catalog tuples belonging to a transaction which never committed can be reclaimed immediately. With 2PC logical decoding or streaming logical decoding, we can always have a split time window in which the ongoing decode cycle needs those tuples. The solution is that even for aborted transactions, we do not return HEAPTUPLE_DEAD if the transaction id is newer than the OldestXmin (same logic we use for deleted tuples of committed transactions). We can do this only for catalog table rows (both system and user defined) to limit the scope of impact. In any case, this needs to be a separate patch along with a separate discussion thread. Peter, I will submit a follow-on patch with documentation changes soon. But this patch is complete IMO, with all the required 2PC logical decoding functionality. Comments, feedback is most welcome. Regards, Nikhils -- Nikhil Sontakke http://www.2ndQuadrant.com/ PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
2pc_logical_19_12_17_without_docs.patch
Description: Binary data