[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

James Taylor (JIRA) Mon, 10 Oct 2016 12:40:50 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563263#comment-15563263
 ]


James Taylor commented on PHOENIX-6:
------------------------------------

I'm exploring an alternate syntax (with equivalent functionality) so that when 
we move to Calcite there will be no need for users to make any changes. Calcite 
supports the MERGE statement which is a superset of this functionality. I'm 
trying to come up with the minimal support for MERGE we'd need. 
{code}
MERGE MyTable AS t
USING (VALUES('a','b',0,0)) AS s(pk1,pk2,counter1,counter2)
ON (t.pk1,t.pk2) = (s.pk1, s.pk2)
WHEN MATCHED THEN UPDATE SET t.counter1 = s.counter1 + 1, t.counter2 = 
s.counter2 + 1
WHEN NOT MATCHED THEN INSERT(pk1,pk2,counter1,counter2) VALUES('a', 'b', 0, 0)
{code}
Not sure how restrict/scope down the ON clause as the only construct allowed 
there would be a PK match. The verbosity of the MERGE statement is not ideal 
for this simple use case, as the ON DUPLICATE KEY  construct only has to define 
the WHEN MATCHED part. Also, it's unfortunate that the VALUES clause has to be 
repeated. Is there any way around that, [~julianhyde]? 

IMHO, the above is harder to read and will be harder to implement too. Perhaps 
we can support the ON DUPLICATE KEY syntax and generate a MERGE statement from 
it when the Phoenix/Calcite integration is complete.

Thoughts?

> Support ON DUPLICATE KEY construct
> ----------------------------------
>
>                 Key: PHOENIX-6
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: James Taylor
>             Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> <column>=<expression>, ...] ]
> {code}
> The following restrictions will apply:
> - The <column> may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> To handle the maintenance of immutable indexes, we'll need to push the 
> maintenance to the server side.
> This is because the mutations for indexes on immutable tables are calculated 
> on the client-side, while this new syntax would potentially modify the value 
> on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

Reply via email to