Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/12087 )

Change subject: KUDU-2640: Add Spark Structured Streaming Sink
......................................................................


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/12087/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12087/2//COMMIT_MSG@9
PS2, Line 9: patche
> typo
Done


http://gerrit.cloudera.org:8080/#/c/12087/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

http://gerrit.cloudera.org:8080/#/c/12087/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala@216
PS2, Line 216:   private def getOperationType(parameters: Map[String, String]): OperationType = {
             :     parameters.get(OPERATION).map(stringToOperationType).getOrElse(Upsert)
             :   }
> Hrm, I get why this is the case for KuduSink, but should it be the case for
I didn't change this behavior. I just refactored the existing logic from above 
into this method. Upsert is the default because an upsert is needed to 
correctly handle Spark retries. A better choice might be insert ignore, which 
is tracked by KUDU-1563.
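
For illustration, a minimal, self-contained sketch of the defaulting behavior 
described above. The OperationType objects, the "kudu.operation" key, and the 
body of stringToOperationType are simplified stand-ins assumed for this 
example; they are not copied from kudu-spark.

```scala
// Simplified stand-ins for the connector's operation types (assumed here).
sealed trait OperationType
case object Insert extends OperationType
case object Upsert extends OperationType
case object Update extends OperationType
case object Delete extends OperationType

object OperationDefaults {
  // Option key assumed for this sketch.
  val OPERATION = "kudu.operation"

  // Hypothetical parser: maps a user-supplied string to an operation type.
  def stringToOperationType(s: String): OperationType = s.toLowerCase match {
    case "insert" => Insert
    case "upsert" => Upsert
    case "update" => Update
    case "delete" => Delete
    case other =>
      throw new IllegalArgumentException(s"Unsupported operation type '$other'")
  }

  // Same shape as the quoted method: fall back to Upsert when the option
  // is absent, so that a retried write does not fail on existing keys.
  def getOperationType(parameters: Map[String, String]): OperationType =
    parameters.get(OPERATION).map(stringToOperationType).getOrElse(Upsert)
}
```

With no option set, getOperationType(Map.empty) yields Upsert; an explicit 
value such as Map("kudu.operation" -> "insert") overrides the default.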


http://gerrit.cloudera.org:8080/#/c/12087/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala@466
PS2, Line 466: batchId: Long
> May be obvious, but mind adding a small note on why we shouldn't use this?
Done


http://gerrit.cloudera.org:8080/#/c/12087/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala@466
PS2, Line 466: batchId: Long
> Yeah, a comment would be nice. I'm assuming this is for de-duplication in t
Like Mike said, the batchId is provided by Spark so you can handle dedupes in 
the case of retries. Kudu doesn't currently have a way to leverage it. Today 
we use upsert, and in the future we could use insert ignore.
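
As a rough sketch of why upsert makes an unused batchId safe, the toy example 
below replays the same batch twice. ToyTable, Row, and this addBatch are 
hypothetical stand-ins for a Kudu table and a streaming sink; nothing here 
uses the real Spark or Kudu APIs.

```scala
import scala.collection.mutable

object RetryDemo {
  final case class Row(key: Int, value: String)

  // Toy in-memory table keyed by primary key; stands in for a Kudu table.
  final class ToyTable {
    private val rows = mutable.Map.empty[Int, String]

    // Insert rejects a key that already exists, so a replayed batch fails.
    def insert(r: Row): Either[String, Unit] =
      if (rows.contains(r.key)) Left(s"key ${r.key} already present")
      else { rows.update(r.key, r.value); Right(()) }

    // Upsert overwrites, so applying the same row twice changes nothing.
    def upsert(r: Row): Unit = rows.update(r.key, r.value)

    def snapshot: Map[Int, String] = rows.toMap
  }

  // Hypothetical sink method: batchId is accepted but deliberately unused,
  // because the table has no way to record which batches were applied.
  def addBatch(table: ToyTable, batchId: Long, batch: Seq[Row]): Unit =
    batch.foreach(table.upsert)
}
```

Calling addBatch twice with the same batchId and rows leaves the table in the 
same state as calling it once, whereas a plain insert of an existing key 
returns an error.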



--
To view, visit http://gerrit.cloudera.org:8080/12087
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731e35f82c8cca7d911e4d879aa6853112132b17
Gerrit-Change-Number: 12087
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Comment-Date: Wed, 09 Jan 2019 20:50:35 +0000
Gerrit-HasComments: Yes
