Re: [HACKERS] MERGE SQL Statement for PG11

Peter Geoghegan Tue, 30 Jan 2018 11:29:11 -0800

On Tue, Jan 30, 2018 at 8:27 AM, Robert Haas <robertmh...@gmail.com> wrote:
> As far as I am able to understand, the substantive issue here is what
> to do when we match an invisible tuple rather than a visible tuple.
> The patch currently throws a serialization error on the basis that you
> (Simon) thought that's what was previously agreed.  Peter is arguing
> that we don't normally issue a serialization error at READ COMMITTED
> (which I think is true) and proposed that we instead try to INSERT.  I
> don't necessarily think that's different from consensus to implement
> option #3 from 
> https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
> because that point #3 says that we're not going to try to AVOID errors
> under concurrency, not that we're going to create NEW errors.
>
> In other words, I understand Peter, then and now, to be saying that MERGE
> should behave just as if invisible tuples didn't exist at all; if that
> leads some other part of the system to throw an ERROR, then that's
> what happens.

Yes, I am still saying that.

What's at issue here specifically is the exact behavior of
EvalPlanQual() in the context of having *multiple* sets of WHEN quals
that need to be evaluated one at a time (in addition to conventional
EPQ join quals). This is a specific, narrow question about the exact
steps that are taken by EPQ when we have to switch between WHEN
MATCHED and WHEN NOT MATCHED cases *as we walk the UPDATE chain*.

Right now, I suspect that we will require some minor variation of
EPQ's logic to account for new risks. The really interesting question
is what happens when we walk the UPDATE chain, while reevaluating EPQ
quals alongside WHEN quals, and then determine that no UPDATE/DELETE
should happen for the first WHEN case -- what then? I suspect that we
may not want to start from scratch (from the MVCC-visible tuple) as we
reach the second or subsequent WHEN case, but that's a very tentative
view, and I definitely want to hear more opinions on it. (Simon wants to
just throw a serialization error here instead, even in READ COMMITTED
mode, which I see as a cop-out.)
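
To make the question concrete, here is a toy Python model of one
possible reading of the tentative view above. It is a sketch only, not
PostgreSQL code: TupleVersion, latest_version() and merge_action() are
hypothetical names, and the rule that a failed join qual falls through
to the WHEN NOT MATCHED action is an assumption made for illustration.
The model walks the UPDATE chain once from the MVCC-visible tuple, then
evaluates each WHEN MATCHED qual in order against the latest version,
rather than starting again from the visible tuple for each WHEN case.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class TupleVersion:
    """One row version; 'newer' links to the version a concurrent UPDATE made."""
    key: int
    balance: int
    newer: Optional["TupleVersion"] = None
    deleted: bool = False  # True if the chain ends in a concurrent DELETE


Qual = Callable[[TupleVersion], bool]


def latest_version(visible: TupleVersion) -> TupleVersion:
    """Follow the UPDATE chain from the MVCC-visible tuple to its end."""
    tup = visible
    while tup.newer is not None:
        tup = tup.newer
    return tup


def merge_action(visible: TupleVersion,
                 join_qual: Qual,
                 when_matched: List[Tuple[Qual, str]],
                 not_matched_action: str = "INSERT") -> str:
    """Pick an action after a single walk of the UPDATE chain.

    Assumption: if the latest version no longer satisfies the join qual
    (or was deleted), the row is treated as NOT MATCHED.
    """
    latest = latest_version(visible)
    if latest.deleted or not join_qual(latest):
        return not_matched_action
    # Evaluate WHEN MATCHED quals one at a time, in order, against the
    # same latest version; no restart from the visible tuple.
    for qual, action in when_matched:
        if qual(latest):
            return action
    return "DO NOTHING"
```

For example, with WHEN cases "balance > 50 then UPDATE" and
"balance = 0 then DELETE", a row that a concurrent transaction updated
from balance 100 to balance 0 would take the DELETE case in this model,
because the first WHEN qual fails on the latest version and the second
one is then checked against that same version.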

Note in particular that this EPQ question has nothing to do with
seeing tuples that are neither visible to our MVCC snapshot nor
visible to EPQ through an UPDATE chain (which starts from the
MVCC-visible tuple). The idea that I have done some kind of about-face on
how concurrency should work is just plain wrong. It is not a helpful
way of framing things. What I am talking about here is very
complicated, but also really narrow.

> Presumably, in a case like this, that would be a common
> outcome, because the merge would be performed on the basis of a unique
> key and so inserting would trigger a duplicate key violation.  But
> maybe not, because I don't think MERGE requires there to be a unique
> key on that column, so maybe the insert would just work, or maybe the
> conflicting transaction would abort just in time to let it work
> anyway.

I think that if we go on to INSERT after an EPQ walk alone has led us
to decide against an UPDATE (rather than throwing a serialization
error), the INSERT is actually very likely to succeed. But there
is no guarantee that you won't get a duplicate violation, because
there is nothing to stop a concurrent *INSERT* with the same PK value.
(That's something that's *always* true, regardless of whether or not
somebody needs to do EPQ.)
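
A toy Python sketch of that last point, under loud assumptions: this is
not how PostgreSQL implements unique indexes, and UniqueIndex,
UniqueViolation and run_not_matched() are hypothetical names. Two
sessions each decide NOT MATCHED from a snapshot taken before either
has inserted, so the second INSERT hits a duplicate violation even
though no EPQ walk was involved.

```python
class UniqueViolation(Exception):
    """Stand-in for a duplicate key error."""


class UniqueIndex:
    """Toy unique index shared by all sessions."""

    def __init__(self):
        self._keys = set()

    def insert(self, key):
        if key in self._keys:
            raise UniqueViolation(f"duplicate key value: {key}")
        self._keys.add(key)


def run_not_matched(index, snapshot_keys, key):
    """Decide MATCHED / NOT MATCHED from the snapshot, then act.

    The snapshot cannot see rows inserted concurrently, so the INSERT
    can still fail against the shared index.
    """
    if key in snapshot_keys:
        return "MATCHED"
    index.insert(key)
    return "INSERTED"
```

Calling run_not_matched() twice with the same key and the same empty
snapshot returns "INSERTED" the first time and raises UniqueViolation
the second time.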

-- 
Peter Geoghegan
