GitHub user fhoering opened a pull request:
https://github.com/apache/spark/pull/22422
[SPARK-25433][PYSPARK] Add support for pex in PySpark
## What changes were proposed in this pull request?
This change provides very basic support for provisioning the
executors with pex files (instead of conda or virtualenv). It contains only
the minimal required changes. Everything else can be set up with environment
variables.
As with conda today, the user needs to make sure that the environment used
to submit the Spark job matches the environment provided in the pex file.
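
As a rough illustration of the "environment variables" part (not code from
this patch), the sketch below builds a spark-submit command line that ships a
pex file to YARN and points the Python workers at it. It assumes the pex file
is distributed with `--files` and selected through `PYSPARK_PYTHON`; the
helper name `build_pex_submit_command` and the paths are hypothetical, and the
actual wiring in this PR may differ.

```python
import os
import shlex


def build_pex_submit_command(pex_path, app_path, deploy_mode="client"):
    """Return a spark-submit argv that ships a pex file to YARN executors.

    Hypothetical helper for illustration only; it does not call Spark.
    """
    pex_name = os.path.basename(pex_path)
    # Files passed via --files end up in the container working directory,
    # so the workers can refer to the pex file by a relative path.
    relative_pex = "./" + pex_name

    cmd = [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", deploy_mode,
        "--files", pex_path,
        "--conf", "spark.executorEnv.PYSPARK_PYTHON=" + relative_pex,
    ]
    if deploy_mode == "cluster":
        # In cluster mode the driver also runs inside YARN, so it needs
        # the same interpreter setting (assumption: same pex for both).
        cmd += ["--conf", "spark.yarn.appMasterEnv.PYSPARK_PYTHON=" + relative_pex]
    cmd.append(app_path)
    return cmd


if __name__ == "__main__":
    argv = build_pex_submit_command("/tmp/myenv.pex", "job.py")
    print(" ".join(shlex.quote(arg) for arg in argv))
```

In client mode the driver still runs with the local interpreter, which is why
the submitting environment has to match the one packaged in the pex file.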
## How was this patch tested?
Various runs with spark-submit (client and cluster mode) and by directly
creating a SparkContext on a YARN cluster.
Also tested with a unit test using a locally created pex file (inspired by
python/pyspark/tests/AddFileTests). The issue is that a pex file contains
information about the platform, so I can't provide a generic test because of
the check that the Python environment is the same on the client and on the
executors. It would mean creating a pex file for every Python runtime and
every existing platform. I can provide the unit test here, along with a
script to create the pex environment so that it can be run locally. It might
also be possible to use SparkSubmitTests, as it calls a separate process (I
haven't tried this yet).
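
To make the "one pex file per runtime and platform" problem concrete, here is
a small sketch that derives a label from the current interpreter and
platform. The label format and the function name `runtime_tag` are made up
for illustration and are not the tags pex itself uses; the point is only that
every distinct value would need its own prebuilt pex file in a generic test.

```python
import platform
import sys


def runtime_tag():
    """Return a hypothetical label for the current Python runtime/platform."""
    major, minor = sys.version_info[:2]
    return "py{}{}-{}-{}".format(
        major,
        minor,
        platform.system().lower(),   # e.g. "linux" or "darwin"
        platform.machine().lower(),  # e.g. "x86_64"
    )


if __name__ == "__main__":
    # A test fixture would have to exist for each distinct tag.
    print(runtime_tag())
```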
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/fhoering/spark pex-support-fix2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22422.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22422
----
commit 3aeae95321f314b128d1ff86b4631145e558d43f
Author: Fabian Höring <f.horing@...>
Date: 2018-09-14T16:34:59Z
Add support for pex in PySpark
----