Will Berkeley has posted comments on this change. Change subject: KUDU-1581 Fix DataFrame read failure when table has Binary Col ......................................................................
Patch Set 1: (6 comments) Good catch! http://gerrit.cloudera.org:8080/#/c/4145/1//COMMIT_MSG Commit Message: PS1, Line 7: Fix DataFrame read failure when table has Binary Col Would you mind including one or two sentences explaining what the cause was and what the solution is? Namely, that we need to make a copy because no-copy essentially returns a pointer, and it didn't make sense to send a pointer over the wire without the data. http://gerrit.cloudera.org:8080/#/c/4145/1/java/kudu-spark/pom.xml File java/kudu-spark/pom.xml: PS1, Line 58: <dependency> : <groupId>org.apache.spark</groupId> : <artifactId>spark-hive_${scala.binary.version}</artifactId> : <version>${spark.version}</version> : </dependency> I don't think we need this. See comment in KuduContextTest.scala. http://gerrit.cloudera.org:8080/#/c/4145/1/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduContextTest.scala File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduContextTest.scala: PS1, Line 38: test("Test kudu-spark DataFrame") I think I'd rather have the Test basic kuduRDD read back all the columns, not just the key, and see that we got the correct values back from all columns. Keeping that test in sync with all of Kudu's types going forward will make sure we don't hit a bug like this again for any type. PS1, Line 43: new HiveContext(sc) I don't think we need a full HiveContext. See L57 of DefaultSourceTest.scala. PS1, Line 47: ] shouldBe "bytes".g While we don't have style guidelines for Scala, the pattern has been to use . throughout. IIRC, that's also the Spark style guideline. http://gerrit.cloudera.org:8080/#/c/4145/1/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala: PS1, Line 110: row.addBinary(8, "bytes".getBytes()) Could you vary this like like we do for the string type? Odd/even variation is fine, just some small amount. -- To view, visit http://gerrit.cloudera.org:8080/4145 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e3926d85fab7efb407325c9992c3fccdab04bad Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Ram Mettu <ram.me...@rms.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com> Gerrit-HasComments: Yes