[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824440#comment-16824440 ]

Alexey Serbin commented on KUDU-1563:
-------------------------------------

[~adar], I don't have any thoughts on this issue yet, but I'll try to take a closer look this week to get more context.

> Add support for INSERT IGNORE
> -----------------------------
>
>                 Key: KUDU-1563
>                 URL: https://issues.apache.org/jira/browse/KUDU-1563
>             Project: Kudu
>          Issue Type: New Feature
>            Reporter: Dan Burkert
>            Assignee: Brock Noland
>            Priority: Major
>              Labels: backup, newbie
>
> The Java client currently has an [option to ignore duplicate row key errors|https://kudu.apache.org/apidocs/org/kududb/client/AsyncKuduSession.html#setIgnoreAllDuplicateRows-boolean-],
> which is implemented by filtering the errors on the client side. If we are
> going to continue to support this feature (and the consensus seems to be that
> we probably should), we should promote it to a first class operation type
> that is handled on the server side. This would have a modest perf.
> improvement since less errors are returned, and it would allow INSERT IGNORE
> ops to be mixed in the same batch as other INSERT, DELETE, UPSERT, etc. ops.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823678#comment-16823678 ]

Adar Dembo commented on KUDU-1563:
----------------------------------

bq. I don't know how to make the API compatible.

[~aserbin], you dealt with KuduWriteOperation's non-PIMPLness recently; do you have any thoughts on this issue?
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823344#comment-16823344 ]

Brock Noland commented on KUDU-1563:
------------------------------------

I'd love to work on this but I don't know how to make the API compatible and am resource constrained. If someone wants to work on it, go right ahead.
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821303#comment-16821303 ]

Grant Henke commented on KUDU-1563:
-----------------------------------

This would be a useful optimization for full restores (via Spark). Right now we use UPSERT in case a Spark task needs to be retried, but when a Spark task fails, that means the retry UPSERTs again all the rows that had already been written successfully.
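To make the retry cost concrete, here is a minimal sketch of the difference between the two semantics. It is not Kudu client code: `Table`, `UpsertAll`, `InsertIgnoreAll`, and `RetryExample` are hypothetical names, and an in-memory `std::map` stands in for a table. It assumes a three-row task whose first attempt stored two rows before failing.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for a table: primary key -> row value.
using Table = std::map<int, std::string>;
using Row = std::pair<int, std::string>;

struct WriteStats {
  int rows_written = 0;  // rows stored, whether newly inserted or overwritten
  int rows_skipped = 0;  // rows skipped because the key already existed
};

// UPSERT semantics: every row is written, even if its key already exists.
WriteStats UpsertAll(Table& table, const std::vector<Row>& rows) {
  WriteStats stats;
  for (const Row& row : rows) {
    table[row.first] = row.second;
    ++stats.rows_written;
  }
  return stats;
}

// INSERT IGNORE semantics: a row whose key exists is skipped without an error.
WriteStats InsertIgnoreAll(Table& table, const std::vector<Row>& rows) {
  WriteStats stats;
  for (const Row& row : rows) {
    const bool inserted = table.emplace(row.first, row.second).second;
    if (inserted) {
      ++stats.rows_written;
    } else {
      ++stats.rows_skipped;
    }
  }
  return stats;
}

// Simulates retrying a three-row task whose first attempt stored two rows.
void RetryExample() {
  const std::vector<Row> batch = {{1, "a"}, {2, "b"}, {3, "c"}};
  Table after_upsert = {{1, "a"}, {2, "b"}};
  Table after_insert_ignore = after_upsert;

  const WriteStats upsert = UpsertAll(after_upsert, batch);
  const WriteStats ignore = InsertIgnoreAll(after_insert_ignore, batch);

  assert(upsert.rows_written == 3);  // the two existing rows are rewritten
  assert(ignore.rows_written == 1);  // only the missing row is written
  assert(ignore.rows_skipped == 2);
  assert(after_upsert == after_insert_ignore);  // same final contents
}
```

Both strategies leave the table in the same state when the retried rows are identical; the difference is only in how many rows get rewritten, which is the cost described above.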
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724448#comment-16724448 ]

Adar Dembo commented on KUDU-1563:
----------------------------------

bq. What would the implications be if we could not do this in a backwards compatible way?

It really depends on the incompatibility itself. IIRC, your draft adds a new data member to {{KuduWriteOperation}}. This class is allocated by libkudu_client.so via calls like {{KuduTable::NewInsert}}, and, most of the time, deallocated by libkudu_client.so after it has taken ownership of the operation in {{KuduSession::Apply}} and sent it on the wire. However, if the operation failed, it'll be assigned to a {{KuduError}} and passed back to the third party application, and the application can choose to take ownership of it via {{KuduError::release_failed_op}}. At that point, the application is on the hook for deallocating it, and if the application and libkudu_client.so don't agree on the size and layout of the class, memory will get corrupted by the deallocation.

In short, the severity and impact of the incompatibility vary from case to case and are pretty difficult to assess thoroughly, which is why I'd err on the side of either "don't do it", or "do it, and rev the client SONAME's major version to express the incompatibility".

bq. I am not sure how to solve the PIMPL'ed issue either but I am happy to investigate.

If you could isolate the changes to just new classes/subclasses (i.e. just a new {{KuduInsertIgnore}} subclass of {{KuduWriteOperation}}), then you'd be in the clear. Barring that, you could implement new variants of {{KuduWriteOperation}} and friends that are PIMPL'ed, and modify the rest of the client APIs to support them alongside the existing variant. Users of the C++ client will need to change their code to use the new variants, but it'll be completely safe from a backwards compatibility perspective.
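The layout problem and the two suggested ways around it can be sketched as follows. None of this is the real Kudu client: `WriteOpV1`, `WriteOpV2`, `InsertIgnoreOp`, and `PimplWriteOp` are hypothetical stand-ins, and the size checks reflect what common ABIs do rather than anything the C++ standard guarantees.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

enum class OpType { kInsert, kInsertIgnore };

// Non-PIMPL class as an application compiled against the old header sees it.
struct WriteOpV1 {
  virtual ~WriteOpV1() = default;
  virtual OpType type() const { return OpType::kInsert; }
  int table_id = 0;
};

// The same class after a hypothetical release adds one data member.
struct WriteOpV2 {
  virtual ~WriteOpV2() = default;
  virtual OpType type() const { return OpType::kInsert; }
  int table_id = 0;
  std::uint64_t ignore_flags = 0;  // new member: size and layout change
};

// Option 1: a new subclass that adds no data members; the base is untouched.
struct InsertIgnoreOp : WriteOpV1 {
  OpType type() const override { return OpType::kInsertIgnore; }
};

// Option 2: a PIMPL'ed class. The public object holds a single pointer, so
// members can be added to Impl later without changing sizeof(PimplWriteOp).
class PimplWriteOp {
 public:
  PimplWriteOp();
  ~PimplWriteOp();
  void set_ignore_duplicates(bool ignore);
  bool ignore_duplicates() const;

 private:
  struct Impl;  // in a real library this would be defined only in the .cc file
  std::unique_ptr<Impl> impl_;
};

struct PimplWriteOp::Impl {
  int table_id = 0;
  bool ignore_duplicates = false;
};

PimplWriteOp::PimplWriteOp() : impl_(new Impl) {}
PimplWriteOp::~PimplWriteOp() = default;
void PimplWriteOp::set_ignore_duplicates(bool ignore) {
  impl_->ignore_duplicates = ignore;
}
bool PimplWriteOp::ignore_duplicates() const {
  return impl_->ignore_duplicates;
}

// Checks the layout expectations on the current platform.
void AbiExample() {
  // An application built against V1 would free the wrong number of bytes.
  assert(sizeof(WriteOpV2) > sizeof(WriteOpV1));
  // A subclass without new members keeps the size of the base class.
  assert(sizeof(InsertIgnoreOp) == sizeof(WriteOpV1));

  PimplWriteOp op;
  op.set_ignore_duplicates(true);
  assert(op.ignore_duplicates());
}
```

The subclass route keeps existing callers untouched, while the PIMPL route requires callers to adopt a new type but leaves room to add state later.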
The [KDE community wiki page on the subject|https://community.kde.org/Policies/Binary_Compatibility_Issues_With_C%2B%2B] has much more useful insight and some examples too.
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723602#comment-16723602 ]

Brock Noland commented on KUDU-1563:
------------------------------------

[~adar]

> I am nervous about inflating the memory consumption of each operation, and
> I'm not sure how to preserve backwards compatibility in the C++ client's
> non-PIMPL'ed KuduWriteOperation class. If you can address both of these
> concerns, I'd be open to per-operation configuration.

From a memory perspective, I think this can be implemented as a bitmask on an integer, which would consume little memory on a per-operation basis. I am not sure how to solve the PIMPL'ed issue either but I am happy to investigate. What would the implications be if we could not do this in a backwards compatible way?

FWIW, I am sure someone outside Impala is using the C++ client, but in my customer base of 25+ Kudu users, we don't have a single one. Thus my gut tells me it's a very small number of users.
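As a rough illustration of the bitmask idea, the sketch below packs per-operation ignore settings into a single byte. The flag names and the `OpFlags` class are hypothetical and do not come from the Kudu client.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values; one bit per kind of error an operation may ignore.
enum IgnoreFlag : std::uint8_t {
  kIgnoreDuplicateKey = 1u << 0,
  kIgnoreMissingKey = 1u << 1,
  kIgnoreMissingPartition = 1u << 2,
};

// Per-operation settings stored as a bitmask in one byte.
class OpFlags {
 public:
  void Set(IgnoreFlag flag) {
    bits_ = static_cast<std::uint8_t>(bits_ | flag);
  }
  void Clear(IgnoreFlag flag) {
    bits_ = static_cast<std::uint8_t>(bits_ & ~flag);
  }
  bool Has(IgnoreFlag flag) const { return (bits_ & flag) != 0; }

 private:
  std::uint8_t bits_ = 0;
};

static_assert(sizeof(OpFlags) == 1, "all flags fit in a single byte");

// Sets two flags, clears one, and checks what remains.
void FlagsExample() {
  OpFlags flags;
  flags.Set(kIgnoreDuplicateKey);
  flags.Set(kIgnoreMissingPartition);
  flags.Clear(kIgnoreDuplicateKey);
  assert(!flags.Has(kIgnoreDuplicateKey));
  assert(flags.Has(kIgnoreMissingPartition));
}
```

The memory cost is small, but adding even one byte as a data member to an existing non-PIMPL class can still change its layout, so this addresses only the first of the two concerns quoted above.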
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723598#comment-16723598 ]

Brock Noland commented on KUDU-1563:
------------------------------------

bq. excuse me, I didn't mean to assign it to myself, assigned it back to you.

No worries! The holidays are typically very productive for my open source contributions, so if we can get agreement on the approach, I think I'll have it complete by the new year.
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723538#comment-16723538 ]

Mike Percy commented on KUDU-1563:
----------------------------------

I know I'm late to this party. I think it's worth modeling what SQL does, and INSERT IGNORE in that context operates at a batch or operation level, not a session level. So it seems a better impedance match to keep this type of error-handling configuration at the operation or batch level in the client API. That avoids requiring SQL clients that cache sessions to keep resetting session options.
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723381#comment-16723381 ]

Attila Bukor commented on KUDU-1563:
------------------------------------

Hi [~brocknoland], excuse me, I didn't mean to assign it to myself; I've assigned it back to you.
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720459#comment-16720459 ]

Brock Noland commented on KUDU-1563:
------------------------------------

Thanks [~r1pp3rj4ck] for picking this up. While I'd like to contribute it, I am more concerned with getting access to the feature. Is this something you have bandwidth to work on now?

bq. I agree that operation level is more intuitive and more flexible, though I don't really see a use case for that added flexibility. Can you articulate one?

I don't have a use case, only a feeling that it is slightly odd to change a session-level parameter to define how an operation behaves. I am fine with the suggested approach; it'll just take more work to implement.
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713156#comment-16713156 ]

Adar Dembo commented on KUDU-1563:
----------------------------------

bq. Why would we have the configuration at the session level? Why not put it at the operation level? I guess databases do have a session level configuration, but it feels odd to me that I am setting at the session level how an {{INSERT IGNORE}} should behave. How about we add an argument to the new operation which specifies the behavior of that operation?

I agree that operation level is more intuitive and more flexible, though I don't really see a use case for that added flexibility. Can you articulate one?

In any case, my concerns are implementation-specific: I am nervous about inflating the memory consumption of each operation, and I'm not sure how to preserve backwards compatibility in the C++ client's non-PIMPL'ed KuduWriteOperation class. If you can address both of these concerns, I'd be open to per-operation configuration.
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713014#comment-16713014 ]

Grant Henke commented on KUDU-1563:
-----------------------------------

I am reading through and catching up on this. I think it would definitely be a nice feature to have. It looks like [~danburkert] also mentioned both the operation level setting and the session level setting:

bq. I think I'm in favor of merging the current patch which introduces an INSERT IGNORE operation to ignore constraint violations of type 1 on the server side. Additionally, we should strongly consider adding session-specific options to selectively ignore each type of constraint individually. So for example, the client could use the INSERT IGNORE operation type if they want to selectively ignore some instances of duplicate primary-key constraints, or it could call KuduSession::ignoreDuplicatePrimaryKeyViolations to ignore all of them for the entire session.

I agree that the intuitive place to define the expected behavior would be on the operation. I am not sure if there is a big benefit to having both, but having it be session-based only seems to reduce the flexibility of what a client can do.
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712794#comment-16712794 ]

Brock Noland commented on KUDU-1563:
------------------------------------

[~adar] - one thing feels a little odd about this. Why would we have the configuration at the session level? Why not put it at the operation level? I guess databases do have a session level configuration, but it feels odd to me that I am setting at the session level how an {{INSERT IGNORE}} should behave. How about we add an argument to the new operation which specifies the behavior of that operation?
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16023757#comment-16023757 ]

Brock Noland commented on KUDU-1563:
------------------------------------

Useful [link|https://mariadb.com/kb/en/mariadb/insert-on-duplicate-key-update/] to understand the differences between {{UPSERT}} and {{ON DUPLICATE KEY UPDATE}}.
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16023372#comment-16023372 ]

Dan Burkert commented on KUDU-1563:
-----------------------------------

Just learned about a use case that would be well-served by an {{ON DUPLICATE KEY UPDATE}} mechanism in Kudu. In particular, the workload is ingesting batches of timestamped records, with each record being quite large. Individual batches routinely contain duplicate records whose contents differ only by collection timestamp. Ideally, as new batches are ingested, duplicate records would update the collection timestamp column but skip updating the larger data columns.

To do this effectively, we could have a duplicate-resolution strategy that updates individual columns to new values, effectively {{ON DUPLICATE KEY UPDATE}} with only constants allowed as the update value. To be efficient, and to map well to SQL, this should probably be specified once on the entire batch instead of on individual ops.
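One possible reading of this strategy is sketched below, using an in-memory map rather than Kudu. The names `StoredRow`, `Batch`, and `ApplyBatch` are hypothetical, and the sketch assumes the constant update value is a single collection timestamp supplied once for the whole batch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stored row: one small column and one large column.
struct StoredRow {
  std::int64_t collected_at;  // small column, refreshed for duplicates
  std::string payload;        // large column, written only on first insert
};

using Table = std::map<std::string, StoredRow>;
// Incoming batch: (primary key, payload) pairs.
using Batch = std::vector<std::pair<std::string, std::string>>;

// Applies one batch. New keys are inserted in full. For a duplicate key only
// collected_at is overwritten, with the constant given once for the batch.
// Returns the number of duplicate keys encountered.
int ApplyBatch(Table& table, const Batch& batch, std::int64_t collected_at) {
  int duplicates = 0;
  for (const auto& entry : batch) {
    const auto result =
        table.emplace(entry.first, StoredRow{collected_at, entry.second});
    if (!result.second) {
      result.first->second.collected_at = collected_at;
      ++duplicates;
    }
  }
  return duplicates;
}

// Ingests two batches that share one record.
void IngestExample() {
  Table table;
  assert(ApplyBatch(table, {{"k1", "big-1"}, {"k2", "big-2"}}, 100) == 0);
  assert(ApplyBatch(table, {{"k2", "big-2"}, {"k3", "big-3"}}, 200) == 1);
  assert(table.at("k1").collected_at == 100);  // not in the second batch
  assert(table.at("k2").collected_at == 200);  // timestamp refreshed
  assert(table.at("k2").payload == "big-2");   // payload left alone
}
```

Because the update value is fixed per batch, the strategy can be stated once rather than on every operation, which is the efficiency argument made above.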
[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE
[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15648540#comment-15648540 ]

Dan Burkert commented on KUDU-1563:
-----------------------------------

[~mjacobs] brings up a good point that the duplicate-key constraint on insert is not the only constraint when writing to Kudu:

# duplicate primary-key constraint on insert
# missing primary-key constraint on delete and update
# missing range partition on any write
# missing column value in column without default on insert

Applications may want to 'ignore' any of these errors when writing to Kudu. Some of these errors are reported by the server (1, 2 and 4), and some are caught by the client before sending (3, and the client could check 4 but currently does not). Of these constraints, I think 1 is the most commonly ignored, and that's why we decided to add first-class support for it by adding a special operation type. Obviously that approach can't scale to all of the constraint types, much less their cross product.

I think I'm in favor of merging the current patch which introduces an INSERT IGNORE operation to ignore constraint violations of type 1 on the server side. Additionally, we should strongly consider adding session-specific options to selectively ignore each type of constraint individually. So for example, the client could use the INSERT IGNORE operation type if they want to selectively ignore some instances of duplicate primary-key constraints, or it could call {{KuduSession::ignoreDuplicatePrimaryKeyViolations}} to ignore all of them for the entire session. We would also expose flags for the rest of the constraint types. Finally, the client should expose how many violations of each type were ignored in the session statistics.
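A rough sketch of what session-level ignore options with per-type statistics might look like follows. The `Violation` enum mirrors the four constraints listed above, but the enum, the `SessionPolicy` class, and its method names are hypothetical and not part of any Kudu client.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// The four constraint types from the list above, in the same order.
enum class Violation : std::size_t {
  kDuplicateKeyOnInsert = 0,
  kMissingKeyOnMutate,
  kMissingRangePartition,
  kMissingRequiredColumn,
  kCount  // number of violation types, not a violation itself
};

// Hypothetical session-level policy: which violations to ignore, plus a
// count of how many of each type were ignored.
class SessionPolicy {
 public:
  void Ignore(Violation v) { ignored_[Index(v)] = true; }

  // Returns true if the violation must be reported as an error. An ignored
  // violation is counted in the statistics and false is returned.
  bool ShouldReport(Violation v) {
    if (ignored_[Index(v)]) {
      ++ignored_count_[Index(v)];
      return false;
    }
    return true;
  }

  int IgnoredCount(Violation v) const { return ignored_count_[Index(v)]; }

 private:
  static constexpr std::size_t kTypes =
      static_cast<std::size_t>(Violation::kCount);
  static std::size_t Index(Violation v) { return static_cast<std::size_t>(v); }

  std::array<bool, kTypes> ignored_{};
  std::array<int, kTypes> ignored_count_{};
};

// Ignores duplicate keys for the session while still reporting other errors.
void PolicyExample() {
  SessionPolicy policy;
  policy.Ignore(Violation::kDuplicateKeyOnInsert);

  assert(!policy.ShouldReport(Violation::kDuplicateKeyOnInsert));
  assert(!policy.ShouldReport(Violation::kDuplicateKeyOnInsert));
  assert(policy.ShouldReport(Violation::kMissingRangePartition));

  assert(policy.IgnoredCount(Violation::kDuplicateKeyOnInsert) == 2);
  assert(policy.IgnoredCount(Violation::kMissingRangePartition) == 0);
}
```

Keeping one flag and one counter per constraint type avoids a separate operation type for every combination, which is the scaling concern raised above.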