Dan Burkert has posted comments on this change.

Change subject: Kudu Spark Datasource for insertions and updates
......................................................................


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/2992/4/java/kudu-spark/src/main/scala/org/kududb/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/kududb/spark/kudu/DefaultSource.scala:

Line 87:     new KuduRelation(tableName, kuduMaster)(sqlContext)
Should this be returning the same KuduRelation created above on line 77?


Line 108:   @transient
Neither of these @transient tags should be necessary, since KuduRelation should 
not be serialized.


Line 142:     val columns = requiredColumns.intersect(primaryKeys) ++ 
(requiredColumns diff primaryKeys) //kudu requires the primary key to be the 
first columns retrieved
The column ordering requirement has been fixed as of 
https://github.com/apache/incubator-kudu/commit/7d57b8b9204a0b170fc204cb7aaa98ad995cf635


http://gerrit.cloudera.org:8080/#/c/2992/4/java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala:

Line 91:   def tableExists(tableName: String) = 
syncClient.tableExists(tableName)
Is this returning a boolean?  I think return type is mandatory in Scala.


Line 98:   def deleteTable(tableName: String) = 
syncClient.deleteTable(tableName)
same here


Line 110:                   options: CreateTableOptions): Unit = {
Unit is redundant


Line 156:                     tableName: String,
indentation


Line 160:     val columns = 
(table.getSchema.getColumns.asScala).map(_.getName()) // gets the columns in 
order so the primary key is first
This shouldn't be necessary anymore.  I think the most efficient thing to do is 
to set the Kudu PartialRow fields by index instead of by column name, this will 
save a bunch of internal hash map lookups which you could do once at the start.


-- 
To view, visit http://gerrit.cloudera.org:8080/2992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I74f20a6c17c47f424dfda62854963dc19c3b78c3
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Chris George <[email protected]>
Gerrit-Reviewer: Andy Grove <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes

Reply via email to