Fengling Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/9834 )

Change subject: [spark]KUDU-2371: Add KuduWriteOptions class and ignoreNull option
......................................................................


Patch Set 4:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/9834/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9834/3//COMMIT_MSG@7
PS3, Line 7: 1: Add KuduW
> nit: KuduWriteOptions.
Done

http://gerrit.cloudera.org:8080/#/c/9834/3//COMMIT_MSG@7
PS3, Line 7: [spark]KU
> nit: Could you also add a [spark] tag at the beginning of the subject line?
Done

http://gerrit.cloudera.org:8080/#/c/9834/3//COMMIT_MSG@10
PS3, Line 10: writes to the Kudu table
> It's unclear what this applies to. Could you mention this is referring to w
Done

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala@118
PS3, Line 118: case "insert" => Insert
> This is a backwards incompatible change. Would it be possible to retain th
Done

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@a250
PS3, Line 250:
> Likewise, removing this will break applications. Is it possible to keep th
Done

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@311
PS3, Line 311: if (errorCount > 0) {
> I think you can skip the default value here, since it's a private method (s
I can skip the default value in writePartitionRows but not writeRows since writeRows is called by insert(data: DataFrame, overwrite: Boolean) in DefaultSource.scala.
Would you recommend skipping the default value for writeRows, or keeping it?

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@310
PS3, Line 310: ors.getRowErrors.length
             : if (errorCount > 0) {
> We don't have a Scala style guide, per se, but I think we should keep the fu
Done

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@328
PS3, Line 328: leName)
> You mean "Can't set primary key column to null". The message should also in
Done

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduWriteOptions.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduWriteOptions.scala:

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduWriteOptions.scala@20
PS3, Line 20: clas
> This isn't a case class. A case class is meant as an immutable sum type (wit
Done

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduWriteOptions.scala@21
PS3, Line 21: var ignoreDuplicateRowErrors: Boolean = false,
> This class is a very different style from the rest of the scala code we hav
Done. By the way, why use val here?

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduWriteOptions.scala@27
PS3, Line 27:
> Looks like you may have been inspired by https://www.dustinmartin.net/gette
Done

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/OperationType.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/OperationType.scala:

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/OperationType.scala@a32
PS3, Line 32:
            :
            :
            :
> I think we need to leave InsertIgnore in for API compatibility.
Done

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:

http://gerrit.cloudera.org:8080/#/c/9834/3/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@174
PS3, Line 174: kuduWriteOptions.ignoreDuplicateRowErrors = true
             : kuduContext.insertRows(updateDF, tableName, kuduWriteOptions)
> We need to leave InsertIgnore in, but it would be good to do this test both
Done

--
To view, visit http://gerrit.cloudera.org:8080/9834
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide908ea29f572849eca0ba850ee197c1b22a07c8
Gerrit-Change-Number: 9834
Gerrit-PatchSet: 4
Gerrit-Owner: Fengling Wang <fw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Fengling Wang <fw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Comment-Date: Tue, 03 Apr 2018 18:22:07 +0000
Gerrit-HasComments: Yes