GitHub user mattf commented on the pull request:
https://github.com/apache/spark/pull/1197#issuecomment-47135124
@rxin & @andrewor14
from what i can tell, there are three separate issues here:
a. a hang on a simple job, reported as SPARK-2244 and SPARK-2242; the root
cause is a stderr buffer deadlock
b. masked output from the shell subprocess, introduced by SPARK-1466; the
root cause is that stderr is not passed through
c. fragile port passing between child and parent in pyspark
each should be addressed in isolation (@andrewor14, your patch tries to
address multiple concerns at once, which is why i'd prefer an
alternative).
i recommend:
1. first, fix (a) with close() and resolve both SPARK-2242 and SPARK-2244
2. second, file a bug for (b) and address it with enhanced exception handling
based on the current SPARK-2242 patch
3. third, file a new bug for (c), with a solution that is yet to be
determined
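
for anyone following along, here is a minimal sketch of the failure mode in (a) and the close() remedy. it is not the actual pyspark code: `CHILD_SOURCE`, `launch_and_read_port`, and the port value 54321 are hypothetical stand-ins, and the child is a small python script rather than the JVM. the assumption is that the parent opens stderr as a pipe, reads only the port from stdout, and never drains stderr, so a chatty child eventually blocks on a full pipe buffer.

```python
import subprocess
import sys

# Hypothetical stand-in for the launched child: it reports a "port" on
# stdout, then emits far more stderr output than a pipe buffer can hold.
CHILD_SOURCE = r"""
import os
print(54321, flush=True)
try:
    os.write(2, b"x" * 1_000_000)
except OSError:
    pass  # the parent closed its end of the pipe; keep going
"""


def launch_and_read_port(timeout=30):
    """Start the child, read its port from stdout, and avoid the stderr hang."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    port = int(proc.stdout.readline())

    # Nothing in the parent reads stderr. If the pipe stayed open, the child
    # would fill the buffer and block on its write indefinitely. Closing the
    # read end makes that write fail or return early instead of blocking.
    proc.stderr.close()
    proc.stdout.close()

    returncode = proc.wait(timeout=timeout)
    return port, returncode


if __name__ == "__main__":
    print(launch_and_read_port())
```

commenting out `proc.stderr.close()` should make `proc.wait` time out on most platforms, which is the hang in miniature. note that close() discards the child's stderr, so it does nothing for (b); passing stderr through is a separate change, which is part of why the fixes are better kept apart.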