Github user tejasapatil commented on the pull request:
https://github.com/apache/spark/pull/11628#issuecomment-197008054
@srowen : The problem sequence you identified is correct. I agree that
checking both at the start and at the end is overkill (not to mention the
overhead and ugliness).
>> it seems weird to proceed to check lines.hasNext if we know there's an
error already.
Yes. The downside of checking at the end is that calling `lines.hasNext` is a
waste of time *iff* there was an exception. But if there was already an
exception, `lines.hasNext` will *not* block, because we have already closed
the stream in the stdin writer at line 165 in the PR. So either there is
nothing to be read OR the underlying `BufferedSource` has already received the
data and needs to return it.
So I favored having the check at the end. It covers all the cases, which the
check at the beginning does not.
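
To make the ordering concrete, here is a minimal sketch of the "check at the end" idea. It is not the code in the PR: `EndCheckSketch`, `pipedLines`, `childError` and the in-memory pipe standing in for the child process are all hypothetical names and simplifications chosen for illustration. The writer thread records its failure *before* closing the stream, so by the time `lines.hasNext` returns `false` at end of stream, the recorded error is already visible to the reader.

```scala
import java.io.{PipedInputStream, PipedOutputStream, PrintWriter}
import java.util.concurrent.atomic.AtomicReference
import scala.io.Source

object EndCheckSketch {
  /** Pipes `input` through an in-memory stream on a writer thread and
    * returns the lines read back. A writer failure is rethrown from
    * `hasNext`, which checks for it AFTER calling `lines.hasNext`.
    * (Hypothetical helper; the pipe stands in for a child process.) */
  def pipedLines(input: Iterator[String]): Iterator[String] = {
    val pipeIn = new PipedInputStream()
    val pipeOut = new PipedOutputStream(pipeIn)
    val childError = new AtomicReference[Throwable](null)

    val writer = new Thread("stdin-writer-sketch") {
      override def run(): Unit = {
        val out = new PrintWriter(pipeOut)
        try {
          // May throw if the upstream iterator fails part-way through.
          input.foreach(line => out.println(line))
        } catch {
          case t: Throwable => childError.set(t) // record first ...
        } finally {
          out.close() // ... then close, which unblocks the reader
        }
      }
    }
    writer.setDaemon(true)
    writer.start()

    val lines = Source.fromInputStream(pipeIn, "UTF-8").getLines()

    new Iterator[String] {
      def hasNext: Boolean = {
        // Blocks until data or end of stream. If the writer has already
        // failed, the stream is closed, so this returns promptly.
        val result = lines.hasNext
        // Check at the end: an error recorded while we were blocked
        // above is still caught here.
        val t = childError.get()
        if (t != null) throw new RuntimeException("writer thread failed", t)
        result
      }
      def next(): String = lines.next()
    }
  }
}
```

With an input iterator that throws part-way through, consuming the result of `pipedLines` ends in a `RuntimeException` whose cause is the original failure. How many lines are yielded before that happens depends on thread timing.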