[kudu-CR] KUDU-1581 Fix DataFrame read failure when table has Binary Col

Will Berkeley (Code Review) Sat, 27 Aug 2016 11:53:58 -0700

Will Berkeley has posted comments on this change.

Change subject: KUDU-1581 Fix DataFrame read failure when table has Binary Col
......................................................................

Patch Set 1:

(6 comments)

Good catch!

http://gerrit.cloudera.org:8080/#/c/4145/1//COMMIT_MSG
Commit Message:

PS1, Line 7: Fix DataFrame read failure when table has Binary Col
Would you mind including one or two sentences explaining what the cause was and
what the solution is? Namely, that we need to make a copy because no-copy
essentially returns a pointer, and it didn't make sense to send a pointer over
the wire without the data.

http://gerrit.cloudera.org:8080/#/c/4145/1/java/kudu-spark/pom.xml
File java/kudu-spark/pom.xml:

PS1, Line 58: <dependency>
: <groupId>org.apache.spark</groupId>
:
<artifactId>spark-hive_${scala.binary.version}</artifactId>
: <version>${spark.version}</version>
: </dependency>
I don't think we need this. See comment in KuduContextTest.scala.

http://gerrit.cloudera.org:8080/#/c/4145/1/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduContextTest.scala
File
java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduContextTest.scala:

PS1, Line 38: test("Test kudu-spark DataFrame")
I think I'd rather have the Test basic kuduRDD read back all the columns, not
just the key, and see that we got the correct values back from all columns.
Keeping that test in sync with all of Kudu's types going forward will make sure
we don't hit a bug like this again for any type.

PS1, Line 43: new HiveContext(sc)
I don't think we need a full HiveContext. See L57 of DefaultSourceTest.scala.

PS1, Line 47: ] shouldBe "bytes".g
While we don't have style guidelines for Scala, the pattern has been to use .
throughout. IIRC, that's also the Spark style guideline.

http://gerrit.cloudera.org:8080/#/c/4145/1/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
File
java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala:

PS1, Line 110: row.addBinary(8, "bytes".getBytes())
Could you vary this like like we do for the string type? Odd/even variation is
fine, just some small amount.

--
To view, visit http://gerrit.cloudera.org:8080/4145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e3926d85fab7efb407325c9992c3fccdab04bad
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ram Mettu <ram.me...@rms.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1581 Fix DataFrame read failure when table has Binary Col

Reply via email to