GitHub user fhoering opened a pull request:

    https://github.com/apache/spark/pull/22422

    [SPARK-25433][PYSPARK] Add support for pex in PySpark

    ## What changes were proposed in this pull request?
    
    This change provides very basic support for provisioning the executors 
with pex files (instead of conda or a virtual env). It contains only the 
minimal required changes. Everything else can be set up with environment 
variables.
    Similar to how it works today with conda, the user needs to make sure that 
the environment used to submit the Spark job matches the environment provided 
in the pex file.
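
    As a rough illustration of the environment-variable setup described above, 
the sketch below builds the environment and command line for a spark-submit run 
that ships a pex file. It assumes the pex file is distributed with `--files` 
and that executors start Python through `PYSPARK_PYTHON`; the helper name 
`pex_submit_command` and the example paths are hypothetical and are not taken 
from this patch.

```python
import os
import shlex


def pex_submit_command(pex_path, app_path, master="yarn", deploy_mode="client"):
    """Return (env, argv) for a spark-submit run that ships a pex file.

    Assumption (not confirmed by the patch): the pex file is sent to the
    executors with --files and PYSPARK_PYTHON points at the shipped copy.
    """
    pex_name = os.path.basename(pex_path)
    env = {
        # Files passed with --files land in the executor's working directory,
        # so the pex is referenced by a relative path.
        "PYSPARK_PYTHON": "./" + pex_name,
    }
    argv = [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        "--files", pex_path,
        app_path,
    ]
    return env, argv


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    env, argv = pex_submit_command("/tmp/my_env.pex", "app.py")
    exports = " ".join("%s=%s" % (k, shlex.quote(v)) for k, v in env.items())
    print(exports + " " + " ".join(shlex.quote(a) for a in argv))
```

    The function only assembles the command; running it for real would still 
require a pex file built for the same Python runtime and platform as the 
executors.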
    
    
    ## How was this patch tested?
    
    Various runs with spark-submit (client and cluster mode) and by directly 
creating a SparkContext on a YARN cluster.
    
    Also tested with a unit test using a locally created pex file (inspired by 
python/pyspark/tests/AddFileTests). The issue is that pex files contain 
information about the platform, so I can't provide a generic test because of 
the check that the Python environment is the same on the client and on the 
executors. It would mean creating a pex file for every Python runtime and 
every existing platform. I can provide the unit test here, along with a script 
to create the pex environment so the test can be run locally. It might also be 
possible to use SparkSubmitTests, since it calls a separate process (I haven't 
tried that yet).
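
    To make the constraint concrete, here is a minimal sketch of the kind of 
environment check mentioned above. It assumes the check compares the Python 
major.minor version between client and executor; the function names and the 
combined runtime/platform tag are hypothetical and only illustrate why one pex 
file per runtime and platform would be needed.

```python
import sys
import sysconfig


def python_version_tag(version_info=None):
    """Return a "major.minor" tag for a version_info-like tuple."""
    info = sys.version_info if version_info is None else version_info
    return "%d.%d" % (info[0], info[1])


def environment_tag():
    """Describe the local runtime and platform, e.g. for naming a test pex."""
    return "py%s-%s" % (python_version_tag(), sysconfig.get_platform())


def check_same_python(client_tag, executor_tag):
    """Raise if client and executor differ in their major.minor version."""
    if client_tag != executor_tag:
        raise RuntimeError(
            "Python on the executor (%s) differs from the client (%s)"
            % (executor_tag, client_tag)
        )


if __name__ == "__main__":
    # A pex built locally only satisfies the check for this one combination.
    print(environment_tag())
    check_same_python(python_version_tag(), python_version_tag())
```

    A generic test would need one prebuilt pex file for every value that 
`environment_tag()` can take across the supported runtimes and platforms.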
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhoering/spark pex-support-fix2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22422.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22422
    
----
commit 3aeae95321f314b128d1ff86b4631145e558d43f
Author: Fabian Höring <f.horing@...>
Date:   2018-09-14T16:34:59Z

    Add support for pex in PySpark

----


---
