Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12101 )

Change subject: Create parallelized loader Spark job
......................................................................


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/12101/1/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala
File 
java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala:

PS1:
> License header.
Done


http://gerrit.cloudera.org:8080/#/c/12101/1/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala@24
PS1, Line 24: import org.apache.kudu.client.PartialRow
            : import org.apache.kudu.client.SessionConfiguration
            : import org.apache.kudu.spark.kudu.KuduContext
            : import 
org.apache.kudu.spark.tools.DistributedDataGeneratorOptions._
            : import org.apache.kudu.util.Dat
> Unused?
Done


http://gerrit.cloudera.org:8080/#/c/12101/1/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala@46
PS1, Line 46:     GeneratorMetrics(sc.longAccumulator("rows_written"), 
sc.longAccumulator("row_collisions"))
> Why does this use the KuduClient directly instead of using KuduContext or s
I am using the client API directly because we are inserting PartialRows instead 
of Spark rows. I can however use the context to get the client. I will do that.


http://gerrit.cloudera.org:8080/#/c/12101/1/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala@75
PS1, Line 75:
> Is this really indicative of a collision?
Perhaps not. This is a carry over from mikes patch. I suspect the idea
is to handle retires. I will remove it for now and add back if needed.


http://gerrit.cloudera.org:8080/#/c/12101/1/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala@83
PS1, Line 83:       // Synchronously flush on potentially the last iteration of 
the
> It's tough to do that because we only get those errors on auto- or manual f
Done


http://gerrit.cloudera.org:8080/#/c/12101/1/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/DistributedDataGenerator.scala@219
PS1, Line 219: ional()
> Nit: Default: ...
Done


http://gerrit.cloudera.org:8080/#/c/12101/1/java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/DistributedDataGeneratorTest.scala
File 
java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/DistributedDataGeneratorTest.scala:

PS1:
> License header.
Done


http://gerrit.cloudera.org:8080/#/c/12101/1/java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/DistributedDataGeneratorTest.scala@32
PS1, Line 32:   val log: Logger = LoggerFactory.getLogger(getClass)
> I probably originally wrote this part of the test before handing this off t
I am happy to remove it to make things more "default".



--
To view, visit http://gerrit.cloudera.org:8080/12101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibdfd41a21a7f80d22125c7f4e5ca4ed62c31709d
Gerrit-Change-Number: 12101
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Comment-Date: Thu, 20 Dec 2018 17:58:39 +0000
Gerrit-HasComments: Yes

Reply via email to