Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11628#discussion_r55650311
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/PipedRDD.scala ---
    @@ -157,8 +167,16 @@ private[spark] class PipedRDD[T: ClassTag](
         val lines = Source.fromInputStream(proc.getInputStream).getLines()
         new Iterator[String] {
           def next(): String = lines.next()
    +
    +      private def propagateChildThreadException(): Unit = {
    --- End diff --
    
    Why a method here? actually this whole iterator implementation looks like 
it needs some work. `hasNext` is supposed to be idempotent, but it's doing the 
work of checking the status. You'd never see an exception if you never called 
it.
    
    Why not check the result of the process exit before returning the iterator? 
then you just return an iterator from `lines`. This is already effectively 
blocking on the return from the process anyway.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to