srowen commented on a change in pull request #23638: [SPARK-26713][CORE] Interrupt pipe IO threads in PipedRDD when task is finished
URL: https://github.com/apache/spark/pull/23638#discussion_r251202175
##########
File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
##########
@@ -141,7 +141,14 @@ final class ShuffleBlockFetcherIterator(
/**
* Whether the iterator is still active. If isZombie is true, the callback interface will no
- * longer place fetched blocks into [[results]].
+ * longer place fetched blocks into [[results]] and the iterator is marked as fully consumed.
+ *
+ * When the iterator is inactive, [[hasNext]] and [[next()]] calls will honor that as there are
+ * cases the iterator is still being consumed. For example, there was a race condition for
+ * ShuffledRDD + PipedRDD if the subprocess command is failed. The task will be marked as failed,
+ * then the iterator will be cleaned up at task completion, the [[next()]] call(called in the
Review comment:
nit: add a space in "call(called" so that it reads "call (called".
I also doubt that `[[` will do anything here, as scaladoc won't be
generated for this anyway.
"there was a race condition": I don't know that describing the previous
issue is useful. Just explain what the code is doing now.
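For reference, the behavior the doc comment describes could be sketched roughly as below. This is only an illustration under assumptions, not the actual ShuffleBlockFetcherIterator code: the class name `ZombieAwareIterator` and its `cleanup()` method are hypothetical, and only the `isZombie` flag name comes from the diff.

```scala
// Hypothetical sketch: an iterator wrapper that reports itself as fully
// consumed once it has been marked inactive (for example by a task
// completion callback running on another thread).
final class ZombieAwareIterator[A](underlying: Iterator[A]) extends Iterator[A] {
  // Volatile so that a flag set by the cleanup thread is visible to the
  // thread that is still consuming the iterator.
  @volatile private var isZombie = false

  // Assumed entry point for cleanup at task completion.
  def cleanup(): Unit = { isZombie = true }

  // Once inactive, behave as if there are no more elements.
  override def hasNext: Boolean = !isZombie && underlying.hasNext

  override def next(): A = {
    if (!hasNext) {
      throw new NoSuchElementException("iterator is inactive or exhausted")
    }
    underlying.next()
  }
}
```

With something like this, a consumer that is still draining the iterator after the task has failed sees `hasNext` return false instead of waiting for more results.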