Will Berkeley has posted comments on this change.

Change subject: KUDU-1581 Fix DataFrame read failure when table has Binary Col

Patch Set 1:


Good catch!

Commit Message:

PS1, Line 7: Fix DataFrame read failure when table has Binary Col
Would you mind including one or two sentences explaining what the cause was and
what the solution is? Namely, that we need to make a copy because the no-copy
accessor essentially returns a pointer, and it doesn't make sense to send a
pointer over the wire without the underlying data.
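To illustrate the point above, here's a minimal sketch (the names `copyOf` and the buffer setup are hypothetical, not the actual Kudu API): a zero-copy accessor hands back a view over internal row memory, so serializing the view would ship positions and offsets rather than the bytes themselves; copying materializes the data first.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class ByteBufferCopy {
    // Copy the remaining bytes out of a buffer view into a standalone array
    // that is safe to serialize and send over the wire.
    static byte[] copyOf(ByteBuffer view) {
        byte[] bytes = new byte[view.remaining()];
        view.duplicate().get(bytes);  // duplicate() leaves the view's position untouched
        return bytes;
    }

    public static void main(String[] args) {
        ByteBuffer view = ByteBuffer.wrap("bytes".getBytes());
        System.out.println(new String(copyOf(view)));
    }
}
```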

File java/kudu-spark/pom.xml:

PS1, Line 58: <dependency>
            :             <groupId>org.apache.spark</groupId>
            :             <version>${spark.version}</version>
            :         </dependency>
I don't think we need this. See comment in KuduContextTest.scala.


PS1, Line 38: test("Test kudu-spark DataFrame")
I think I'd rather have the "Test basic kuduRDD" test read back all of the
columns, not just the key, and verify that we got the correct values back from
every column. Keeping that test in sync with all of Kudu's types going forward
will make sure we don't hit a bug like this again for any type.

PS1, Line 43: new HiveContext(sc)
I don't think we need a full HiveContext. See L57 of DefaultSourceTest.scala.

PS1, Line 47: ] shouldBe "bytes".g
While we don't have style guidelines for Scala, the pattern has been to use
dot notation (`.`) throughout rather than infix. IIRC, that's also the Spark
style guideline.


PS1, Line 110: row.addBinary(8, "bytes".getBytes())
Could you vary this like we do for the string type? Odd/even variation is
fine, just some small amount.
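For example, something along these lines (a sketch only; the helper name and exact values are hypothetical, and the real test would pass the result to `row.addBinary`): varying the payload by row parity lets a read-back test catch values that get swapped or duplicated across rows, which identical per-row data would hide.

```java
public class VaryBinary {
    // Hypothetical helper: vary the binary payload by row index parity,
    // mirroring the odd/even variation used for the string column.
    static byte[] binaryValueFor(int i) {
        return (i % 2 == 0 ? "bytes" : "bytes" + i).getBytes();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) {
            System.out.println(new String(binaryValueFor(i)));
        }
    }
}
```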

To view, visit http://gerrit.cloudera.org:8080/4145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e3926d85fab7efb407325c9992c3fccdab04bad
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ram Mettu <ram.me...@rms.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-HasComments: Yes