Dan Burkert has posted comments on this change.

Change subject: KUDU-1824. KuduRDD.collect fails because of NoSerializableException
......................................................................
Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5636/2//COMMIT_MSG
Commit Message:

Line 7: KUDU-1824. KuduRDD.collect fails because of NoSerializableException
> would be good to explain the approach in the commit message
Done

http://gerrit.cloudera.org:8080/#/c/5636/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala:

Line 120: override def next(): Row = {
> does this now introduce an extra allocation/copy in the non-RDD case (DataF
I'm not entirely sure. I don't fully understand how these objects were previously being serialized. I guess the RDD is able to reach into Kudu and serialize our internal row block, and is smart enough to do it only once (not once per row). Honestly, I'm not sure how we would fix this while keeping that behavior for the RDD case without copying all this code again.

--
To view, visit http://gerrit.cloudera.org:8080/5636

Gerrit-MessageType: comment
Gerrit-Change-Id: I42618188003d2eef66088f3101803d1750e4134b
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes
