Will Berkeley has posted comments on this change.
Change subject: KUDU-1581 Fix DataFrame read failure when table has Binary Col
Patch Set 1:
PS1, Line 7: Fix DataFrame read failure when table has Binary Col
Would you mind including a sentence or two explaining what the cause was and
what the fix is? Namely, that we need to make a copy because the no-copy read
essentially returns a pointer, and it doesn't make sense to send a pointer over
the wire without the data.
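For context, a minimal sketch of the copy in question (assuming the no-copy read hands back a `ByteBuffer` view into the row buffer; `copyBinary` is a hypothetical helper for illustration, not the actual patch):

```scala
import java.nio.ByteBuffer

// Hypothetical helper: materialize the bytes out of the no-copy
// ByteBuffer view so the value can be serialized independently of
// the underlying row buffer.
def copyBinary(view: ByteBuffer): Array[Byte] = {
  val copy = new Array[Byte](view.remaining())
  view.duplicate().get(copy) // duplicate() leaves the view's position untouched
  copy
}
```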
PS1, Line 58: <dependency>
I don't think we need this. See comment in KuduContextTest.scala.
PS1, Line 38: test("Test kudu-spark DataFrame")
I think I'd rather have the "Test basic kuduRDD" test read back all the
columns, not just the key, and verify that we got the correct values back from
all of them. Keeping that test in sync with all of Kudu's types going forward
will make sure we don't hit a bug like this again for any type.
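Sketching the shape I have in mind, self-containedly (the row type and expected values here are illustrative stand-ins, not the actual kudu-spark test code):

```scala
// Illustrative stand-in for a Kudu row with one column per type under test.
case class TestRow(key: Int, s: String, b: Seq[Byte])

// Stand-in for what kuduRDD.collect() would hand back.
val collected = Seq(TestRow(0, "0", "bytes".getBytes("UTF-8").toSeq))

// Expected values keyed by primary key.
val expected = Map(0 -> TestRow(0, "0", "bytes".getBytes("UTF-8").toSeq))

// Compare every column of every row, so a column that round-trips
// incorrectly (like the binary one here) fails loudly.
collected.foreach { row => assert(row == expected(row.key)) }
```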
PS1, Line 43: new HiveContext(sc)
I don't think we need a full HiveContext. See L57 of DefaultSourceTest.scala.
PS1, Line 47: ] shouldBe "bytes".g
While we don't have style guidelines for Scala, the pattern has been to use dot
notation (`.`) throughout rather than infix calls. IIRC, that's also the Spark
style guideline.
PS1, Line 110: row.addBinary(8, "bytes".getBytes())
Could you vary this like we do for the string type? Odd/even variation is
fine, just some small amount.
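For example, mirroring an odd/even variation (the index name `i`, the helper `binaryFor`, and the exact values are assumptions for illustration; the real code would feed the result to `row.addBinary(8, ...)`):

```scala
// Vary the binary value by row index, odd/even style, so rows aren't
// all identical and a misread of the binary column would be caught.
def binaryFor(i: Int): Array[Byte] =
  (if (i % 2 == 0) "bytes" else s"bytes $i").getBytes("UTF-8")

// In the test row setup this would become:
// row.addBinary(8, binaryFor(i))
```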
To view, visit http://gerrit.cloudera.org:8080/4145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Owner: Ram Mettu <ram.me...@rms.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>