Re: [HACKERS] On conflict update & hint bits
On Sun, Oct 23, 2016 at 3:42 PM, Tom Lanewrote: > The business about not throwing a serialization failure due to actions > of our own xact does seem worth fixing now, but if I understand correctly, > we can deal with that by adding a test for xmin-is-our-own-xact into > the existing code structure. I propose doing that much (and adding > Munro's regression test case) and calling it good for today. Thanks. This is the only part of it that I consider an actual *bug* (since you can retry the serialization failure forever and never move on because there is no other transaction involved which can finish to clear the problem) as opposed to an opportunity to optimize (by reducing false positive serialization failures actually involving other transactions). I never saw anything in the literature addressing the intersection of UPSERT and SSI, and I think we do need to think about and discuss this a bit more before we can be sure of the best fix. This is probably not thread on which to have that discussion. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
Konstantin Knizhnikwrites: > Just for information: I know that you are working on this issue, but as > far as we need to proceed further with our testing of multimaster, > I have done the following obvious changes and it fixes the problem (at > least this assertion failure is not happen any more): This seems kind of the hard way --- why didn't you put the buffer lock calls into ExecCheckHeapTupleVisible? https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8f1fb7d621b0e6bd2eb0ba2ac9634c5b5a03564b regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
On 24.10.2016 00:49, Peter Geoghegan wrote: On Sun, Oct 23, 2016 at 2:46 PM, Tom Lanewrote: What's bothering me is (a) it's less than 24 hours to release wrap and (b) this patch changes SSI-relevant behavior and hasn't been approved by Kevin. I'm not familiar enough with that logic to commit a change in it on my own authority, especially with so little time for problems to be uncovered. I should point out that I knew that the next set of point releases had been moved forward much later than you did. I had to work on this fix during the week, which was pretty far from ideal for me for my own reasons. Just for information: I know that you are working on this issue, but as far as we need to proceed further with our testing of multimaster, I have done the following obvious changes and it fixes the problem (at least this assertion failure is not happen any more): src/backend/executor/nodeModifyTable.c @@ -1087,6 +1087,13 @@ ExecOnConflictUpdate(ModifyTableState *mtstate, test = heap_lock_tuple(relation, , estate->es_output_cid, lockmode, LockWaitBlock, false, , ); +/* + * We must hold share lock on the buffer content while examining tuple + * visibility. Afterwards, however, the tuples we have found to be + * visible are guaranteed good as long as we hold the buffer pin. + */ +LockBuffer(buffer, BUFFER_LOCK_SHARE); + switch (test) { case HeapTupleMayBeUpdated: @@ -1142,6 +1149,7 @@ ExecOnConflictUpdate(ModifyTableState *mtstate, * loop here, as the new version of the row might not conflict * anymore, or the conflicting tuple has actually been deleted. */ +LockBuffer(buffer, BUFFER_LOCK_UNLOCK); ReleaseBuffer(buffer); return false; @@ -1175,6 +1183,8 @@ ExecOnConflictUpdate(ModifyTableState *mtstate, /* Store target's existing tuple in the state's dedicated slot */ ExecStoreTuple(, mtstate->mt_existing, buffer, false); +LockBuffer(buffer, BUFFER_LOCK_UNLOCK); + /* * Make tuple and any needed join variables available to ExecQual and * ExecProject. The EXCLUDED tuple is installed in ecxt_innertuple, while -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
On Sun, Oct 23, 2016 at 2:46 PM, Tom Lanewrote: > What's bothering me is (a) it's less than 24 hours to release wrap and > (b) this patch changes SSI-relevant behavior and hasn't been approved > by Kevin. I'm not familiar enough with that logic to commit a change > in it on my own authority, especially with so little time for problems > to be uncovered. I should point out that I knew that the next set of point releases had been moved forward much later than you did. I had to work on this fix during the week, which was pretty far from ideal for me for my own reasons. > I'm okay with adding an explicit buffer lock/unlock pair, and in fact > plan to go have a look at that in a bit. I'm not okay with doing a > refactoring that might change the behavior in more ways than that > under these circumstances. Fair enough. As long as we do that much, I'm comfortable. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
Peter Geogheganwrites: > On Sun, Oct 23, 2016 at 1:42 PM, Tom Lane wrote: >> "Rarely" is not "never". A bigger problem though is that heap_fetch >> does more than just lock the buffer: there are also PredicateLockTuple >> and CheckForSerializableConflictOut calls in there. It's possible that >> those are no-ops in this usage (because after all we already fetched >> the tuple once), or maybe they're even desirable because they would help >> resolve Kevin's concerns. But it hasn't been analyzed and so I'm not >> prepared to risk a behavioral change in this already under-tested area >> the day before a release wrap. > I'm surprised at how you've assessed the risk here. What's bothering me is (a) it's less than 24 hours to release wrap and (b) this patch changes SSI-relevant behavior and hasn't been approved by Kevin. I'm not familiar enough with that logic to commit a change in it on my own authority, especially with so little time for problems to be uncovered. I'm okay with adding an explicit buffer lock/unlock pair, and in fact plan to go have a look at that in a bit. I'm not okay with doing a refactoring that might change the behavior in more ways than that under these circumstances. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
On Sun, Oct 23, 2016 at 1:42 PM, Tom Lanewrote: > "Rarely" is not "never". A bigger problem though is that heap_fetch > does more than just lock the buffer: there are also PredicateLockTuple > and CheckForSerializableConflictOut calls in there. It's possible that > those are no-ops in this usage (because after all we already fetched > the tuple once), or maybe they're even desirable because they would help > resolve Kevin's concerns. But it hasn't been analyzed and so I'm not > prepared to risk a behavioral change in this already under-tested area > the day before a release wrap. The heap_fetch() contract doesn't ask its callers to consider any of that. Besides, those actions (PredicateLockTuple + CheckForSerializableConflictOut) are very probably redundant no-ops as you say. (They won't help with Kevin's additional concerns, BTW, because Kevin is concerned about excessive serialization failures with SSI.) > AFAICT, it's very hard to get to an actual failure from the lack of > buffer lock here. You would need to have a situation where the tuple > was not hintable when originally fetched but has become so by the time > ON CONFLICT is rechecking it. That's possible, eg if we're using > async commit and the previous transaction's commit record has gotten > flushed to disk in between, but it's not likely. Even then, no actual > problem would ensue (in non-assert builds) from setting a hint bit without > buffer lock, except in very hard-to-hit race conditions. So the buffer > lock issue doesn't seem urgent enough to me to justify making a fix under > time pressure. > > The business about not throwing a serialization failure due to actions > of our own xact does seem worth fixing now, but if I understand correctly, > we can deal with that by adding a test for xmin-is-our-own-xact into > the existing code structure. I propose doing that much (and adding > Munro's regression test case) and calling it good for today. I'm surprised at how you've assessed the risk here. I think that the risk of not proceeding with fixing the buffer lock issue is greater than the risk of breaking something else with the proposed fix. Even if the former risk isn't such a big one. If there are a non-trivial number of users that are particularly attached to the precise behavior of higher isolation levels in master (assuming it would be altered by the proposed fix), then it's surprising that it took so many months for someone to complain about the ON CONFLICT DO NOTHING behavior being blatantly broken. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
Peter Geogheganwrites: > On Sat, Oct 22, 2016 at 9:38 AM, Tom Lane wrote: >> 1. Why are you replacing ExecOnConflictUpdate's ExecCheckHeapTupleVisible >> call with ExecCheckTIDVisible? That appears to demand a re-fetch of the >> tuple ExecOnConflictUpdate already has its hands on. Wouldn't it be >> better to just put a buffer lock/unlock around the existing code? > I thought that that was less than idiomatic within nodeModifyTable.c > -- executor modules rarely directly acquire buffer locks. "Rarely" is not "never". A bigger problem though is that heap_fetch does more than just lock the buffer: there are also PredicateLockTuple and CheckForSerializableConflictOut calls in there. It's possible that those are no-ops in this usage (because after all we already fetched the tuple once), or maybe they're even desirable because they would help resolve Kevin's concerns. But it hasn't been analyzed and so I'm not prepared to risk a behavioral change in this already under-tested area the day before a release wrap. AFAICT, it's very hard to get to an actual failure from the lack of buffer lock here. You would need to have a situation where the tuple was not hintable when originally fetched but has become so by the time ON CONFLICT is rechecking it. That's possible, eg if we're using async commit and the previous transaction's commit record has gotten flushed to disk in between, but it's not likely. Even then, no actual problem would ensue (in non-assert builds) from setting a hint bit without buffer lock, except in very hard-to-hit race conditions. So the buffer lock issue doesn't seem urgent enough to me to justify making a fix under time pressure. The business about not throwing a serialization failure due to actions of our own xact does seem worth fixing now, but if I understand correctly, we can deal with that by adding a test for xmin-is-our-own-xact into the existing code structure. I propose doing that much (and adding Munro's regression test case) and calling it good for today. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
On Sat, Oct 22, 2016 at 9:38 AM, Tom Lanewrote: > 1. Why are you replacing ExecOnConflictUpdate's ExecCheckHeapTupleVisible > call with ExecCheckTIDVisible? That appears to demand a re-fetch of the > tuple ExecOnConflictUpdate already has its hands on. Wouldn't it be > better to just put a buffer lock/unlock around the existing code? I thought that that was less than idiomatic within nodeModifyTable.c -- executor modules rarely directly acquire buffer locks. More importantly, while the lighter weight ExecCheckHeapTupleVisible() variant seemed to make sense when there was no buffer lock acquisition, now that it's clear that a buffer lock acquisition is necessary, I am skeptical. I now doubt that the added complexity pays for itself, which is why I proposed to remove ExecCheckHeapTupleVisible(). Besides, ON CONFLICT seems like a feature that is significantly less compelling at higher isolation levels; that must be why it took this long to hear a problem report like Konstantin's. > 2. Your proposed coding > > + if (!heap_fetch(rel, estate->es_snapshot, , , true, > NULL)) > + ... > + if > (!TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple.t_data))) > > will dump core in the case where heap_fetch returns false with > tuple.t_data set to null. If that case is impossible here, > it's not apparent why, and it'd surely deserve at least a comment, > maybe an Assert. But I'd rather just assume it's possible. You could say the same thing about at least one other existing place that more or less assumes a conflictTid heap_fetch() or similar cannot allow that to happen, I think. Maybe this is just as good a place to talk about it as any other, though. I defer to you. (It's safe because our snapshot is a kind of interlock against vacuum killing the conflictTid tuple.) > I'm not taking a position on whether this is OK at the higher level > of whether the SSI behavior could be better. However, unless Kevin > has objections to committing something like this now, I think we > should fix the above-mentioned problems and push it. The existing > behavior is clearly bad. I don't have any strong feelings on it myself just yet; I trust that Kevin's judgement that we should do better under SSI is sound. I should have mentioned that Kevin encouraged me to do this much first in my description of the patch. I'm pretty confident that he will have no objection to doing something like this for now. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
Peter Geogheganwrites: > Attached is a revision of Thomas' patch to fix a related issue; it now > also fixes this no-buffer-lock-held issue. I think this needs more work. 1. Why are you replacing ExecOnConflictUpdate's ExecCheckHeapTupleVisible call with ExecCheckTIDVisible? That appears to demand a re-fetch of the tuple ExecOnConflictUpdate already has its hands on. Wouldn't it be better to just put a buffer lock/unlock around the existing code? 2. Your proposed coding + if (!heap_fetch(rel, estate->es_snapshot, , , true, NULL)) + ... + if (!TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple.t_data))) will dump core in the case where heap_fetch returns false with tuple.t_data set to null. If that case is impossible here, it's not apparent why, and it'd surely deserve at least a comment, maybe an Assert. But I'd rather just assume it's possible. I'm not taking a position on whether this is OK at the higher level of whether the SSI behavior could be better. However, unless Kevin has objections to committing something like this now, I think we should fix the above-mentioned problems and push it. The existing behavior is clearly bad. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
On Sat, Oct 1, 2016 at 5:15 AM, Peter Geogheganwrote: > On Fri, Sep 30, 2016 at 5:33 PM, Konstantin Knizhnik > wrote: >> So the question is whether it is correct that ExecOnConflictUpdate tries to >> access and update tuple without holding lock on the buffer? > > You're right -- this is a bug in Postgres. (Thomas Munro and Kevin Grittner added to CC list.) Attached is a revision of Thomas' patch to fix a related issue; it now also fixes this no-buffer-lock-held issue. He posted his version of this to a -general mailing list thread a week ago [1]. This was a thread about spurious serialization failure with ON CONFLICT DO NOTHING. I understand that Kevin is still not happy with the behavior under SSI even with our fix, since serialization failures will still occur more often than necessary (see other thread for details of what I'm talking about). I believe this patch to be a strict improvement on master, and I suggest it be applied in advance of the deadline to get fixes in for the next set of point releases. Note that this revision includes Thomas' isolation test, which is completely unchanged. I still intend to work with Kevin to do better than this under SSI, though that will probably be for a future point release. The behavior proposed by my fix here is the right thing for REPEATABLE READ isolation level, which has nothing to do with Kevin's proposed more careful handling under SSI. Besides, the buffer pin bug reported by Konstantin on this thread really should be addressed ASAP. [1] https://www.postgresql.org/message-id/CAEepm=3ra9ngdhocdbtb4iib7mwdavqybns3f47svkh1mk-...@mail.gmail.com -- Peter Geoghegan From 648de70f226831af04e1d31329118c325161da0b Mon Sep 17 00:00:00 2001 From: Peter Geoghegan Date: Wed, 19 Oct 2016 13:59:20 -0700 Subject: [PATCH] Fix ON CONFLICT bugs at higher isolation levels. When used at higher isolation levels, ON CONFLICT DO NOTHING could raise spurious serialization failure errors. This happened when duplicate values were proposed for insertion within a single ON CONFLICT DO NOTHING command. (Similar ON CONFLICT DO UPDATE cases typically raised valid cardinality violation errors instead.) Separately, though in the same code path, insufficient care was taken when a visibility check was performed on some existing, conflicting tuple. At least a shared buffer lock is required on any buffer containing a tuple whose visibility is checked through tqual.c routines, which may set hint bits (a buffer pin is therefore insufficient). To fix the first issue, check if tuple was created by current transaction as a further condition of raising a serialization failure in relevant test. To fix the second issue, outsource visibility test to preexisting heap_fetch() call. Patch by Thomas Munroe and Peter Geoghegan. First bug reported by Jason Dusek. Second bug reported by Konstantin Knizhnik. Report:
Re: [HACKERS] On conflict update & hint bits
On Sat, Oct 1, 2016 at 1:15 PM, Peter Geogheganwrote: > It also looks like the DO NOTHING variant is similarly affected, even > when the isolation level is READ COMMITTED.:-( Actually, the DO NOTHING variant is also only affected in isolation levels greater than READ COMMITTED. Not sure why I thought otherwise. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
On Fri, Sep 30, 2016 at 5:33 PM, Konstantin Knizhnikwrote: > So the question is whether it is correct that ExecOnConflictUpdate tries to > access and update tuple without holding lock on the buffer? You're right -- this is a bug in Postgres. I'm travelling from Ireland to the USA this weekend, but will work on this early next week. I don't think it's a particularly tricky fix -- as you say, it is necessary to have at least a shared buffer lock to call HeapTupleSatisfiesVisibility(), and we quite simply fail to ensure that. We must have a shared buffer lock in the visibility-check path for ON CONFLICT DO UPDATE where isolation level > READ COMMITTED -- a buffer pin is not enough. It also looks like the DO NOTHING variant is similarly affected, even when the isolation level is READ COMMITTED.:-( Thanks for the detailed report. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
On 30.09.2016 19:37, Peter Geoghegan wrote: On Fri, Sep 30, 2016 at 5:33 PM, Konstantin Knizhnikwrote: Later we try to check tuple visibility: ExecCheckHeapTupleVisible(estate, , buffer); and inside HeapTupleSatisfiesMVCC try to set hint bit. So, you're using repeatable read or serializable isolation level? Repeatable read. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
On Fri, Sep 30, 2016 at 5:33 PM, Konstantin Knizhnikwrote: > Later we try to check tuple visibility: > > ExecCheckHeapTupleVisible(estate, , buffer); > > and inside HeapTupleSatisfiesMVCC try to set hint bit. So, you're using repeatable read or serializable isolation level? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers