Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 )
Change subject: [examples] Add basic Spark example written in Scala ...................................................................... Patch Set 4: Code-Review+1 (11 comments) Updated per Grant's feedback. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@18 PS1, Line 18: = Kudu-Spark example README > nit: extra space Done http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@40 PS1, Line 40: - KuduMasters: A String value consisting of a comma-separated list of Kudu Master Host addresses. This will need to be pointed to the Kudu cluster you wish to target. > Is this needed? Given the code is creating the table, we have control over Good call - probably better to give less control as to not cause additional failures. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@43 PS1, Line 43: To specify a value at execution time, you'll specify the parameter name in the '-D<name>' format. For example, to set a different set of masters for the Kudu cluster from the command line and use a custom table name, set the property `KuduMasters` to a CSV of the master addresses in the form `host:port` and add a table name value, as shown: > nit: Move up near KUDU_MASTERS I didn't see any default listed for this value - it's inclusion is only due to it being a requirement of the SparkSession create statement. I suppose you could infer that 'local' is the default value, since any other value would require a spark master to be explicitly defined. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala File examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala: http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@14 PS1, Line 14: object SparkExample { > nit: val KuduMasters: String = ... Done http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@19 PS1, Line 19: val NameCol = "name" > Should this be configurable like KUDU_MASTERS? I went ahead and made it configurable, as to allow those who want to the option of changing where it runs. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@32 PS1, Line 32: > Nit: I think the below sytax is easier Went ahead and made that syntax change - bit easier to read. Also went ahead and explicitly defined 'false' for the structFields instead of the NULLABLE constant. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@39 PS1, Line 39: > nit: Replication factor of 1 might be easier for examples. Done http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@43 PS1, Line 43: ) > Can this be moved up by the other imports. From what I've read in the Spark documentation and what I've seen in the code base, it can't be moved. It's being called directly from the SparkSession we instantiated above. I could move it higher in main(), but I can't take it up to the other imports unfortunately. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@47 PS1, Line 47: if (!kc.tableExists(TableName)) { > Nit: Use slf4j logging. Done http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@77 PS1, Line 77: //create a DF and map the values of the KuduMaster and TableName values > No need to catch if there is no handling. Just let the exception bubble up. Done http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@87 PS1, Line 87: kc.deleteTable(TableName) > No need to catch here either. Done -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett <mbarn...@cloudera.com> Gerrit-Reviewer: Attila Bukor <abu...@apache.org> Gerrit-Reviewer: Grant Henke <granthe...@apache.org> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett <mbarn...@cloudera.com> Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com> Gerrit-Comment-Date: Thu, 25 Oct 2018 23:11:25 +0000 Gerrit-HasComments: Yes