Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18211 )

Change subject: [java] KUDU-3350 add the support for deleteIgnoreRows
......................................................................


Patch Set 4:

(4 comments)

Thank you for the patch!

http://gerrit.cloudera.org:8080/#/c/18211/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18211/4//COMMIT_MSG@9
PS4, Line 9: Spark launches a speculative (duplicate) task for the long running task. If
           : the task runs deleting operations on kudu, it will cause 'key not found'
           : issue. This patch adds the basic functionality to support speculative
           : deleting tasks by adding deleteIgnoreRows.
I'm not sure I understand why duplicated tasks deleting the same rows are 
relevant here.  There might be many other scenarios in which a DELETE 
operation is sent when the target row isn't present in the table.

Maybe simply state that this patch adds a new deleteIgnoreRows() wrapper for 
the DELETE_IGNORE operations introduced with KUDU-1563 (see 
https://github.com/apache/kudu/commit/7fbe341e51a9e4245d8b3017cecf11e393c3a22b)?


http://gerrit.cloudera.org:8080/#/c/18211/4/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/18211/4/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@334
PS4, Line 334:  that have already been deleted.
nit: remove this -- those absent rows might never have been there in the first 
place, right?


http://gerrit.cloudera.org:8080/#/c/18211/4/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@347
PS4, Line 347: ${numDeletes.value.get(tableName)}
Is this going to be reported as the actual number of deleted rows or just the 
number of issued DELETE_IGNORE operations?
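
The two numbers in that question can diverge whenever some target rows are 
absent. A toy in-memory model of the distinction, purely for illustration: 
this is not Kudu client code, and the class, method, and field names below are 
hypothetical. It only assumes that an "ignore" delete skips absent keys 
without raising an error.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy in-memory model, NOT Kudu client code: every name here is hypothetical.
public class DeleteIgnoreCounting {

  /** Result of a batch: operations issued vs. rows actually removed. */
  public static final class Counts {
    public final int issued;
    public final int removed;

    Counts(int issued, int removed) {
      this.issued = issued;
      this.removed = removed;
    }
  }

  /** Deletes each key if present; absent keys are skipped without an error. */
  public static Counts deleteIgnore(Set<Integer> table, List<Integer> keys) {
    int removed = 0;
    for (Integer key : keys) {
      // Set.remove returns true only when the key was actually present.
      if (table.remove(key)) {
        removed++;
      }
    }
    return new Counts(keys.size(), removed);
  }

  public static void main(String[] args) {
    Set<Integer> table = new HashSet<>(Arrays.asList(1, 2, 3));
    // Key 3 is issued twice and key 42 was never in the table.
    Counts c = deleteIgnore(table, Arrays.asList(2, 3, 3, 42));
    System.out.println("issued=" + c.issued + " removed=" + c.removed);
  }
}
```

In this model four operations are issued but only two rows are removed, so a 
log line built from the issued count would overstate the rows deleted.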


http://gerrit.cloudera.org:8080/#/c/18211/4/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:

http://gerrit.cloudera.org:8080/#/c/18211/4/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@74
PS4, Line 74: testDuplicateDelete
In addition to this test case with duplicated delete operations, maybe it's 
worth adding a very simple test to make sure that deleting anything from an 
empty table using DELETE_IGNORE always succeeds?
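
The shape of such a check can be sketched against a toy in-memory table. This 
is only a model of the expected semantics, not the Kudu or kudu-spark API: the 
class and both methods are hypothetical stand-ins, with the strict variant 
standing in for a plain DELETE that reports a missing key as a failure.

```java
import java.util.HashSet;
import java.util.Set;

// Toy in-memory model, NOT the Kudu API: all names here are hypothetical.
public class EmptyTableDeleteCheck {

  /** Strict delete: reports failure (false) when the key is absent. */
  public static boolean delete(Set<Integer> table, int key) {
    return table.remove(key);
  }

  /** Ignore variant: an absent key is not an error, so this always succeeds. */
  public static boolean deleteIgnore(Set<Integer> table, int key) {
    table.remove(key);
    return true;
  }

  public static void main(String[] args) {
    Set<Integer> emptyTable = new HashSet<>();
    for (int key = 0; key < 10; key++) {
      // Every ignore-delete against the empty table must succeed.
      if (!deleteIgnore(emptyTable, key)) {
        throw new AssertionError("deleteIgnore failed for key " + key);
      }
      // The strict variant is expected to fail for the same key.
      if (delete(emptyTable, key)) {
        throw new AssertionError("strict delete unexpectedly succeeded for key " + key);
      }
    }
    System.out.println("all deleteIgnore operations on the empty table succeeded");
  }
}
```

A real test would issue the operations through the new deleteIgnoreRows() 
wrapper against an empty Kudu table and assert that no row errors come back.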



--
To view, visit http://gerrit.cloudera.org:8080/18211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f89ced9ffa4a79f46661873f01c38aefb1d78d5
Gerrit-Change-Number: 18211
Gerrit-PatchSet: 4
Gerrit-Owner: Hongjiang Zhang <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Hongjiang Zhang <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 12 Feb 2022 01:33:21 +0000
Gerrit-HasComments: Yes
