Dan Burkert has posted comments on this change.

Change subject: add the wrapper for the Iterator[Row] of KuduRDD by using Spark InterruptibleIterator[Row]
......................................................................

Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7753/1//COMMIT_MSG
Commit Message:

Line 9: once we the submit a spark job to spark cluster, we cannot cancel or stop the kudu-spark
nit: Don't indent message body, and uppercase 'Once'. Wrap at 72 characters.

PS1, Line 10: Aftering using
nit: s/Aftering using/With

http://gerrit.cloudera.org:8080/#/c/7753/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala:

Line 21: import org.apache.spark.InterruptibleIterator
Add this import to the braced block on line 24.

Line 65: new InterruptibleIterator[Row](taskContext, new RowIterator(scanner))
InterruptibleIterator checks the status for every single row, perhaps it would be more efficient to only check when a new scan batch is needed from Kudu? The check is pretty simple: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/InterruptibleIterator.scala#L36. That check could slot in right before the 'scanner.nextRows()' call on line 91.

--
To view, visit http://gerrit.cloudera.org:8080/7753
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b4284f2c0a40cd7ba8cf2b76e0403592552c814
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: caiconghui <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: caiconghui <[email protected]>
Gerrit-HasComments: Yes
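
For reference, the batch-level check suggested in the Line 65 comment might look roughly like the sketch below. It is not the actual KuduRDD code: to stay self-contained it avoids the Spark and Kudu dependencies, so `BatchScanner`, `TaskKilledSketchException`, and the `isInterrupted` function are hypothetical stand-ins for the Kudu scanner, Spark's task-killed exception, and the task context's interruption flag. The only point it illustrates is that the interruption check runs once per fetched batch rather than once per row.

```scala
// Stand-in for Spark's task-killed exception (assumption: lets the
// sketch compile without Spark on the classpath).
final class TaskKilledSketchException
    extends RuntimeException("task interrupted")

// Hypothetical stand-in for a scanner that returns rows in batches.
trait BatchScanner[T] {
  def hasMoreRows: Boolean
  def nextRows(): Iterator[T]
}

// Iterator that consults the interruption flag only when the current
// batch is exhausted and a new one must be fetched.
final class BatchInterruptibleIterator[T](
    isInterrupted: () => Boolean, // stand-in for the task context flag
    scanner: BatchScanner[T])
    extends Iterator[T] {

  private var current: Iterator[T] = Iterator.empty

  override def hasNext: Boolean = {
    // Loop so that empty batches are skipped transparently.
    while (!current.hasNext && scanner.hasMoreRows) {
      // The check sits right before the batch fetch.
      if (isInterrupted()) throw new TaskKilledSketchException
      current = scanner.nextRows()
    }
    current.hasNext
  }

  override def next(): T = {
    if (!hasNext) throw new NoSuchElementException("end of scan")
    current.next()
  }
}
```

The trade-off is cancellation latency: a killed task is noticed only at the next batch boundary, so the delay is bounded by the time needed to consume one batch.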
