On Wed, Feb 28, 2018 at 8:53 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Feb 27, 2018 at 5:07 PM, Peter Geoghegan <p...@bowt.ie> wrote:
>> I now feel like Simon's suggestion of throwing an error in corner
>> cases isn't so bad. It still seems like we could do better, but the
>> more I think about it, the less that seems like a cop-out. My reasons
>> are:
>
> I still think we really ought to try not to add a new class of error.

I do too. However, it's only fair to acknowledge that the lack of any
strong counterproposal after a month or so tells us something.

>> * We can all agree that *not* raising an error in the specific way
>> Simon proposes is possible, somehow or other. We also all agree that
>> avoiding the broader category of RC errors can only be taken so far
>> (e.g. in any event duplicate violations errors are entirely possible,
>> in RC mode, when a MERGE inserts a row). So this is a question of what
>> exact middle ground to take. Neither of the two extremes (throwing an
>> error on the first sign of a RC conflict, and magically preventing
>> concurrency anomalies) are actually on the table.
>
> Just because there's no certainty about which behavior is best doesn't
> mean that none of them are better than throwing an error.

Bear in mind that the patch doesn't throw an error at the first sign
of trouble. It throws an error after doing an EPQ style walk, which
individually verifies the extra "WHEN ... AND" quals for every tuple
in the update chain (ExecUpdate()/ExecDelete() return after first EPQ
tuple check for MERGE, so that an ExecMerge() caller can do that extra
part with special EPQ slot). The controversial serialization error
only comes when the original basic assessment of MATCHED needs to
change, without regard to extra "WHEN ... AND" quals (and only when
the MERGE statement was written in such a way that that matters -- it
must have a WHEN NOT MATCHED clause, too).

AFAICT, the patch will only throw an error when the join quals no
longer pass -- not when the extra "WHEN ... AND" quals no longer pass.
That would be far worse, IMV. ExecMerge() will retry until it reaches
the end of the update chain, possibly changing its mind about which
particular WHEN case applies repeatedly, going back and forth between
mergeActions (WHEN cases) as the update chain in walked. The patch can
do a lot to roll with conflicts before it gives up. Conflicts are
generally caused by concurrent DELETEs, so you could say that this
offers a kind of symmetry with concurrent INSERTs causing duplicate
violation errors. (Concurrent UPDATEs that change the join qual values
could provoke the error too, but presumably we're joining on the PK in
most cases, so that seems much less likely than a simple DELETE.)

Bear in mind that both the SQL Server and Oracle implementations have
many significant restrictions/caveats around doing something cute with
the main join quals. It's not surprising that we'd have something like
that, too. According to Rahila, Oracle doesn't allow MERGE to update
the columns in the join qual at all, and only allows a single WHEN
MATCHED case (a DELETE can only come after an UPDATE in Oracle, oddly
enough). SQL Server's docs have a warning that states: "Do not attempt
to improve query performance by filtering out rows in the target table
in the ON clause, such as by specifying AND NOT target_table.column_x
= value. Doing so may return unexpected and incorrect results.". What
does that even mean!?

I am slightly concerned that the patch may still have livelock hazards
in its approach to RC conflict handling. I haven't studied that aspect
in enough detail, though. Since we always make forward progress by
following an update chain, any problem here is probably fixable.

-- 
Peter Geoghegan

Reply via email to