[GitHub] spark pull request: [SPARK-13793] [CORE] PipedRDD doesn't propagat...

tejasapatil Tue, 15 Mar 2016 13:32:19 -0700

Github user tejasapatil commented on the pull request:

    https://github.com/apache/spark/pull/11628#issuecomment-197008054
  
    @srowen : The problem sequence you identified is correct. I agree that 
doing the check at both the start and the end is overkill (not to mention the 
overhead and ugliness). 
    
    >> it seems weird to proceed to check lines.hasNext if we know there's an 
error already. 
    
    Yes. The downside of checking at the end is that calling `lines.hasNext` 
is a waste of time *iff* there was an exception. But if there was already an 
exception, `lines.hasNext` will *not* block, because the stdin writer has 
already closed the stream (line 165 in the PR). So either there is nothing 
left to read, OR the underlying `BufferedSource` has already received the data 
and needs to return it.
    
    So I favored having the check at the end. It covers all the cases, 
unlike a check at the beginning.
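    
    To make the ordering concrete, here is a minimal sketch of the 
check-at-the-end variant. It is not the code in the PR: `CheckedLines` and 
`childThreadException` are hypothetical names, and the sketch assumes the 
stdin writer thread records its failure in an `AtomicReference` before 
closing the stream.

```scala
import java.util.concurrent.atomic.AtomicReference

// Hypothetical wrapper around the iterator over the child process's stdout.
// `childThreadException` is an assumed name for wherever the stdin writer
// thread stores a failure; it holds null while no failure has occurred.
class CheckedLines(
    lines: Iterator[String],
    childThreadException: AtomicReference[Throwable]) extends Iterator[String] {

  override def hasNext: Boolean = {
    // Ask the underlying reader first. Per the reasoning above, if the writer
    // already failed it has closed the stream, so this call should not block.
    val result = lines.hasNext
    // Check at the end: this sees a failure recorded before the call above
    // as well as one recorded while that call was in progress.
    val t = childThreadException.get()
    if (t != null) throw t
    result
  }

  override def next(): String = {
    if (!hasNext) throw new NoSuchElementException("no more lines")
    lines.next()
  }
}
```

    With a single check placed before `lines.hasNext` instead, a failure 
recorded while that call is in progress would go unnoticed until the next 
call, if there is one.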



