The fact that PostgreSQL implements MERGE and “INSERT … ON CONFLICT” (so-called 
UPSERT) differently with regard to consistency is an implementation detail in 
PostgreSQL, and I refuse to accept it as rationale for what we do in Calcite. 
(If we try to emulate the behavior of systems that suck… well, it’s a race to 
mediocrity.)

But I do accept that if we parse an UPSERT command, we should not throw away 
the information in the SqlNode or RelNode representations.

I think we should do what PostgreSQL has done[1]: add a ‘conflictBehavior’ 
field, whose value is one of 3 enum values: ‘FAIL’ or ‘DO_NOTHING’ or ‘UPDATE’, 
or something. So, the UPSERT command we implemented for Phoenix would be 
"INSERT … ON CONFLICT UPDATE”.

But we should not add complex actions like "ON CONFLICT (did) DO UPDATE SET 
dname = EXCLUDED.dname” to INSERT. If people want complex actions, they should 
use Calcite's MERGE command. If you want to translate a Calcite MERGE to a 
PostgreSQL ‘INSERT … ON CONFLICT UPDATE …’, have at it.

By the way, I don’t like the word ‘UPSERT’. It probably started an in-joke 
among kernel developers. PostgreSQL doesn’t use it in their SQL (only in their 
documentation), and neither should we. 

We had to add UPSERT to support Phoenix’s dialect (because Phoenix inherited 
limitations on consistency from the underlying HBase engine) but we should 
deprecate UPSERT and move to INSERT … ON CONFLICT instead.

Julian

[1] https://www.postgresql.org/docs/current/sql-insert.html 
<https://www.postgresql.org/docs/current/sql-insert.html>

> On Mar 18, 2020, at 2:07 AM, Enrico Olivelli <[email protected]> wrote:
> 
> Il Mer 18 Mar 2020, 08:35 Christian Beikov <[email protected] 
> <mailto:[email protected]>> ha
> scritto:
> 
>> I'd argue that these technical details are in fact the reason why this
>> is a different functionality, that deserves special handling.
>> 
>> I expect an upsert statement like `INSERT INTO tbl(a, b) VALUES(1, 2) ON
>> CONFLICT DO NOTHING` to never produce a constraint violation.
>> 
>> This is functionally different from a statement like `INSERT INTO tbl(a,
>> b) SELECT a, b (VALUES(1, 2)) x(a, b) WHERE NOT EXISTS(SELECT 1 FROM tbl
>> y WHERE x.a = y.a AND x.b = y.b)` which might cause the constraint
>> violation.
>> 
>> On the other hand, an upsert statement like `INSERT INTO tbl(a, b)
>> VALUES(1, 2) ON CONFLICT DO UPDATE` guarantees that at the end of the
>> statement, there is the tuple `(1, 2)`. There are other variants that
>> provide different functionality(conflict handler conditions, special
>> update clause, etc.) and overall, for DBMS that do not support the ON
>> CONFLICT clause, it is necessary to fallback to PL/SQL to handle
>> constraint violations in exception handlers within loops to ensure the
>> same behvaior.
>> 
>> If Calcite transforms such an UPSERT statement to a MERGE statement, it
>> must at least be flagged to require atomicity to be able to generate the
>> correct logic for the backend that this is running on.
>> 
> 
> Please don't do this.
> They are different features
> MERGE is more powerful and it is also for moving data from a table to
> another one (but I have never used MERGE)
> 
> Enrico
> 
> 
>> Am 17.03.2020 um 22:28 schrieb Julian Hyde:
>>> I don't think there's a significant difference between the UPSERT and
>>> MERGE. The differences are in marketing (which keyword to use) and in
>>> technical details (e.g. concurrency semantics). Not worth splitting a
>>> core concept over. We spend a lot of effort keeping like-things-like.
>>> 
>>> On Tue, Mar 17, 2020 at 1:11 AM Christian Beikov
>>> <[email protected]> wrote:
>>>> AFAIK MERGE has different concurrency semantics than what some DBMS call
>>>> UPSERT. PostgreSQL for example has a guaranteed insert or update
>>>> semantics whereas MERGE could end with constraint violation errors:
>>>> https://wiki.postgresql.org/wiki/UPSERT
>>>> 
>>>> Maybe it's worth adding that to the relational model after all?
>>>> 
>>>> Am 17.03.2020 um 07:17 schrieb Enrico Olivelli:
>>>>> Il Lun 16 Mar 2020, 23:55 Julian Hyde <[email protected]> ha
>> scritto:
>>>>> 
>>>>>> Change the unparse operation for the dialect so that you generate
>> UPSERT
>>>>>> rather than MERGE.
>>>>>> 
>>>>>> IIRC we did this for another dialect - maybe Phoenix.
>>>>>> 
>>>>> Julian,
>>>>> In my case (HerdDB) I need to see the presence of UPSERT in the RelNode
>>>>> (EnumerableTableModify oŕ LogicalTableModify)
>>>>> I saw the JIRA where you introduced UPSERT for Phoenix
>>>>> I will check deeper, thanks
>>>>> 
>>>>> Enrico
>>>>> 
>>>>> 
>>>>> 
>>>>>> Julian
>>>>>> 
>>>>>>> On Mar 16, 2020, at 1:22 AM, Enrico Olivelli <[email protected]>
>>>>>> wrote:
>>>>>>> Il Lun 16 Mar 2020, 09:06 Stamatis Zampetakis <[email protected]>
>> ha
>>>>>>> scritto:
>>>>>>> 
>>>>>>>> Hi Enrico,
>>>>>>>> 
>>>>>>>> I have the impression that UPSERT is currently supported only at the
>>>>>> parser
>>>>>>>> level [1] so it seems normal that you don't find something relevant
>> in
>>>>>>>> LogicalTableModify etc. Note that the SQL standard equivalent is the
>>>>>> MERGE
>>>>>>>> statement [2] but this also seems to be supported only at the
>>>>>>>> parser/validator level [2].
>>>>>>>> I guess it is not necessary to introduce more things in TableModify
>>>>>> since
>>>>>>>> UPSERT seems to be syntactic sugar. I think that most of the work
>> can be
>>>>>>>> done in RelToSqlConverter [4] and possibly the rest of the code can
>> be
>>>>>> left
>>>>>>>> intact.
>>>>>>>> 
>>>>>>> I would like to sens a patch that introduces the ability to keep the
>>>>>>> SqlInsert 'keywords' and pass them to TableModify?
>>>>>>> Would it be a good approach?
>>>>>>> 
>>>>>>> The alternative is to introduce a new Operation but it would be a
>> more
>>>>>>> invasive change and I think it is not worth
>>>>>>> 
>>>>>>> 
>>>>>>> Enrico
>>>>>>> 
>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Stamatis
>>>>>>>> 
>>>>>>>> [1] https://issues.apache.org/jira/browse/CALCITE-492
>>>>>>>> [2] https://en.wikipedia.org/wiki/Merge_(SQL)
>>>>>>>> [3] https://issues.apache.org/jira/browse/CALCITE-985
>>>>>>>> [4]
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>> https://github.com/apache/calcite/blob/d234626227954eefffe49f42abec65c649ffe3a6/core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java#L2395
>>>>>>>>> On Sun, Mar 15, 2020 at 6:53 PM Enrico Olivelli <
>> [email protected]>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> Hi,
>>>>>>>>> I am trying to use UPSERT but it seems to me that in TableModify
>> (or
>>>>>>>>> best LogicalTableModify or EnumerableTableModify) there is no way
>> to
>>>>>>>>> distinguish an INSERT from an UPSERT.
>>>>>>>>> 
>>>>>>>>> Am I missing something?
>>>>>>>>> 
>>>>>>>>> Regards
>>>>>>>>> Enrico

Reply via email to