Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11571 )
Change subject: KUDU-2563: [spark] Use the scanner keepAlive API ...................................................................... Patch Set 2: (7 comments) > It would be good to double check that KuduScanner.keepAlive() is well behaved > when the scan is "in between" two tablets. Ideally that would do nothing, but > at the very least it shouldn't try to send a keep alive RPC into the ether > and then throw an exception when that fails. This is already verified in the keepAlive API unit tests and given that there is no exposure to that here, I don't think it's worth testing here as well. http://gerrit.cloudera.org:8080/#/c/11571/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala: http://gerrit.cloudera.org:8080/#/c/11571/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala@125 PS1, Line 125: * @param scanner the wrapped scanner : * @param kuduContext the kudu context > Update this. Done http://gerrit.cloudera.org:8080/#/c/11571/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala@141 PS1, Line 141: * Calls the keepAlive API on the current scanner if the keepAlivePeriodMs has passed. > I wish I had remembered earlier, but this isn't safe; scanners aren't threa Ah, right. Makes sense. I will switch to the read loop model then. http://gerrit.cloudera.org:8080/#/c/11571/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala@161 PS1, Line 161: } > Don't we need to do this if we were interrupted too? So maybe wrap the whol We wouldn't have needed to, given the interrupt only exists for shutting down. But I am removing the thread model anyway. http://gerrit.cloudera.org:8080/#/c/11571/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala: http://gerrit.cloudera.org:8080/#/c/11571/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala@19 PS2, Line 19: import java.util.concurrent.Executors > Can you double check that you still need all of these new imports? Done http://gerrit.cloudera.org:8080/#/c/11571/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala@146 PS2, Line 146: LOG.error(s"Calling keepAlive on ${scanner.currentTablet}") > For debugging only? yep. removed. http://gerrit.cloudera.org:8080/#/c/11571/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala@147 PS2, Line 147: scanner.keepAlive() > Is it OK if this fails and throws a KuduException? yes, I would expect an exception to fail the job. http://gerrit.cloudera.org:8080/#/c/11571/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduReadOptions.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduReadOptions.scala: http://gerrit.cloudera.org:8080/#/c/11571/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduReadOptions.scala@53 PS1, Line 53: val defaultKeepAlivePeriodMs: Long = 15000 // 25% of the default scanner ttl. > Should doc the choice in value here, probably related to the default value Done -- To view, visit http://gerrit.cloudera.org:8080/11571 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7f26d6ab8deb24982055d247938a11e188c35db Gerrit-Change-Number: 11571 Gerrit-PatchSet: 2 Gerrit-Owner: Grant Henke <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Comment-Date: Thu, 04 Oct 2018 23:41:16 +0000 Gerrit-HasComments: Yes
