GitHub user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/30#issuecomment-41764738
Also, we should definitely document how to set up PySpark on YARN, so the
user doesn't have to jump through hoops to get a simple job running. The
biggest thing is probably to emphasize that it only works if we build with Maven.
Maybe we should also have a section that explains what to do when you run into
the unhelpful `java.io.EOFException`. Or, better still, we could throw an
exception with a nicer message that prints out the PYTHONPATH and complains
that it can't find pyspark.
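
Something along these lines might work as a rough sketch of the nicer error. It assumes the check can run on the Python side before any work starts. The helper name `check_module_importable` is hypothetical and not part of Spark, and where such a check would actually live is left open.

```python
import importlib.util
import os


def check_module_importable(module_name="pyspark"):
    """Raise a descriptive ImportError if a top-level module cannot be found.

    Hypothetical helper for illustration only; not an existing Spark API.
    """
    # find_spec returns None when a top-level module is not on the search path.
    if importlib.util.find_spec(module_name) is None:
        pythonpath = os.environ.get("PYTHONPATH", "<not set>")
        raise ImportError(
            "Cannot find module '%s'. PYTHONPATH is: %s" % (module_name, pythonpath)
        )
```

Calling `check_module_importable()` early would surface the missing module and the current PYTHONPATH directly, rather than leaving the user to work it out from an `EOFException`.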