Simon Riggs <si...@2ndquadrant.com> writes: > On 23 April 2017 at 00:55, Tom Lane <t...@sss.pgh.pa.us> wrote: >> It's not clear to me how much potential this has to create user data >> corruption, but it doesn't look good at first glance. Discuss.
> SubTransSetParent() only matters for the case where we are half-way > through a commit with a large commit. Since we do atomic updates of > commit and subcommmit when on same page, this problem would only > matter when top level xid and other subxids were separated across > multiple pages and the issue would only affect a read only query > checking visibility during the commit for that prepared transaction. > Furthermore, the nature of the corruption is that we set the xid to > point to the subxid; since we never mark a top-level commit as > subcommitted, subtrans would never be consulted and so the corruption > would not lead to any incorrect answer even during the window of > exposure. So it looks to me like this error shouldn't cause a problem. Still trying to wrap my head around this argument ... I agree that incorrectly setting the parent's pg_subtrans entry can't cause a visible problem, because it will never be consulted. However, failing to set the child's entry seems like it could cause transient problems. There would only be a risk during the eventual commit of the prepared transaction, and only when the pg_xact entries span multiple pages, but then there would be a window where the child xact is marked subcommitted but it has a zero entry in pg_subtrans. That would result in a WARNING from TransactionIdDidCommit, and a false result, which depending on timing might mean that commit of the overall transaction appears non-atomic to onlookers. (That is, the parent xact might already appear committed to them, but the subtransaction not yet.) I can't find any indication in the archives that anyone's ever reported seeing that WARNING, though that might only mean that they'd not noticed it. But in any case, it seems like the worst consequence is a narrow window for seeing non-atomic commit of a prepared transaction on a standby server. That seems pretty unlikely to cause real-world problems. The other point about overwriteOK not being set when it needs to be also seems like it could not cause any problems in production builds, since that flag is only examined by an Assert. regards, tom lane -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers