Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11147 )

Change subject: Create parallelized loader Spark job
......................................................................


Patch Set 4:

(10 comments)

Not sure if you meant to abandon this or not; seemed like that was an accident.

http://gerrit.cloudera.org:8080/#/c/11147/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11147/4//COMMIT_MSG@14
PS4, Line 14: marke
marked


http://gerrit.cloudera.org:8080/#/c/11147/4/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala
File 
java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala:

PS4: 
License header.


http://gerrit.cloudera.org:8080/#/c/11147/4/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala@24
PS4, Line 24: case class TableOptions(
            :     numPartitions: Int,
            :     replicationFactor: Int,
            :     numColumns: Int,
            :     intColumnPercentage: Float)
Unused?


http://gerrit.cloudera.org:8080/#/c/11147/4/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala@46
PS4, Line 46:     val kuduClient = new 
KuduClientBuilder(options.masterAddresses).build()
Why does this use the KuduClient directly instead of using KuduContext or 
something like that? Will this work with a Kerberized cluster?


http://gerrit.cloudera.org:8080/#/c/11147/4/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala@75
PS4, Line 75: isServiceUnavailable
Is this really indicative of a collision?


http://gerrit.cloudera.org:8080/#/c/11147/4/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala@83
PS4, Line 83:       rowsWritten += 1
The subtraction and addition is a little weird. Maybe you can add the happy 
path into the if/else and increment rowsWritten there?


http://gerrit.cloudera.org:8080/#/c/11147/4/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala@218
PS4, Line 218: Defaults to
Nit: Default: ...

(to be consistent with the other options here.)


http://gerrit.cloudera.org:8080/#/c/11147/4/java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/DistributedDataGeneratorTest.scala
File 
java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/DistributedDataGeneratorTest.scala:

PS4:
License header.


http://gerrit.cloudera.org:8080/#/c/11147/4/java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/DistributedDataGeneratorTest.scala@19
PS4, Line 19:   private val TABLE_SCHEMA: Schema = {
Can't you use your fancy schema generator here to get better coverage?


http://gerrit.cloudera.org:8080/#/c/11147/4/java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/DistributedDataGeneratorTest.scala@32
PS4, Line 32:     .setNumReplicas(1)
Why this?



--
To view, visit http://gerrit.cloudera.org:8080/11147
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4434c9f5d709154037386b4c7be94045df162267
Gerrit-Change-Number: 11147
Gerrit-PatchSet: 4
Gerrit-Owner: Mike Percy <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 18 Dec 2018 00:32:59 +0000
Gerrit-HasComments: Yes

Reply via email to