[
https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023372#comment-16023372
]
Dan Burkert commented on KUDU-1563:
-----------------------------------
Just learned about a use case that would be well served by an {{ON DUPLICATE
KEY UPDATE}} mechanism in Kudu. In particular, the workload ingests batches
of timestamped records, each record being quite large. Individual batches
routinely contain duplicate records whose contents differ only by collection
timestamp. Ideally, as new batches are ingested, duplicate records would
update the collection timestamp column but skip updating the larger data
columns. To do this effectively, we could add a duplicate-resolution strategy
that updates individual columns to new values: effectively {{ON DUPLICATE KEY
UPDATE}} with only constants allowed as the update value. To be efficient,
and to map well to SQL, this should probably be specified once for the entire
batch rather than on individual ops.
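For illustration only, the desired per-batch resolution resembles MySQL's
{{ON DUPLICATE KEY UPDATE}} syntax. A minimal sketch, assuming a hypothetical
{{records}} table (Kudu does not support this today):

```sql
-- Hypothetical: on a duplicate key, refresh only the collection timestamp
-- and leave the large payload column untouched.
INSERT INTO records (id, payload, collected_at)
VALUES (42, '<large payload>', '2017-05-24 10:00:00')
ON DUPLICATE KEY UPDATE collected_at = '2017-05-24 10:00:00';
```

Note that the update value is a constant from the same batch, matching the
restriction proposed above.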
> Add support for INSERT IGNORE
> -----------------------------
>
> Key: KUDU-1563
> URL: https://issues.apache.org/jira/browse/KUDU-1563
> Project: Kudu
> Issue Type: New Feature
> Reporter: Dan Burkert
> Assignee: Brock Noland
> Labels: newbie
>
> The Java client currently has an [option to ignore duplicate row key errors|
> https://kudu.apache.org/apidocs/org/kududb/client/AsyncKuduSession.html#setIgnoreAllDuplicateRows-boolean-],
> which is implemented by filtering the errors on the client side. If we are
> going to continue supporting this feature (and the consensus seems to be
> that we probably should), we should promote it to a first-class operation
> type handled on the server side. This would yield a modest performance
> improvement, since fewer errors would be returned, and it would allow INSERT
> IGNORE ops to be mixed into the same batch as other INSERT, DELETE, UPSERT,
> etc. ops.
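For comparison, the proposed first-class op would mirror MySQL's {{INSERT
IGNORE}}: a duplicate key is skipped on the server instead of being returned
as a per-row error for the client to filter. A sketch, with a hypothetical
table name:

```sql
-- Hypothetical: server-side INSERT IGNORE; a row whose key already exists
-- is silently skipped, so no error reaches the client.
INSERT IGNORE INTO records (id, payload, collected_at)
VALUES (42, '<large payload>', '2017-05-24 10:00:00');
```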
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)