Github user Sephiroth-Lin commented on the pull request:
https://github.com/apache/spark/pull/5478#issuecomment-93270239
@andrewor14 @sryza Yes, assuming that the Python files will already be
present on the slave machines is not very reasonable. But if users want to use
PySpark today, they must build Spark with JDK 1.6 (Python cannot import from an
assembly jar produced by a newer JDK), and I think most users are on JDK 1.7+
now. Maybe a good solution is to package PySpark in a separate jar that YARN
automatically ships to all containers, and add that jar to PYTHONPATH along
with the assembly jar.
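
A rough sketch of that idea, just for illustration (the names pyspark.zip and
spark-assembly.jar are placeholders, not actual build outputs, and this is not
the real Spark YARN client code): YARN localizes distributed resources into
each container's working directory, so the client would only need to put the
archive and the assembly jar on PYTHONPATH by relative name.

    // Sketch only: build the PYTHONPATH value a YARN client could set in the
    // container launch environment, so executors can import PySpark from a
    // separately shipped archive instead of the assembly jar.
    object PythonPathSketch {
      // Placeholder names; the real file names depend on the build.
      val assemblyJar = "spark-assembly.jar"
      val pysparkArchive = "pyspark.zip"

      def containerPythonPath(existing: Option[String]): String = {
        // YARN localizes distributed resources into the container's working
        // directory, so relative names are sufficient here.
        val entries = Seq(pysparkArchive, assemblyJar) ++ existing.toSeq
        entries.mkString(java.io.File.pathSeparator)
      }

      def main(args: Array[String]): Unit = {
        // Prepend the shipped archive and assembly jar to any inherited path.
        println(containerPythonPath(sys.env.get("PYTHONPATH")))
      }
    }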