GitHub user yaooqinn opened a pull request:

    https://github.com/apache/spark/pull/19840

    [SPARK-22640][PYSPARK][YARN]switch python exec in executor side

    ## What changes were proposed in this pull request?
    ```
    PYSPARK_PYTHON=~/anaconda3/envs/py3/bin/python \
    bin/spark-submit --master yarn --deploy-mode client \
    --archives ~/anaconda3/envs/py3.zip \
    --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=py3.zip/py3/bin/python \
    --conf spark.executorEnv.PYSPARK_PYTHON=py3.zip/py3/bin/python \
    /home/hadoop/data/apache-spark/spark-2.1.2-bin-hadoop2.7/examples/src/main/python/mllib/correlations_example.py
    ```
    In the case above, I created a Python environment, shipped it via
    `--archives`, and then pointed the executors at it via
    `spark.executorEnv.PYSPARK_PYTHON`.
    However, the executors appeared to use
    `PYSPARK_PYTHON=~/anaconda3/envs/py3/bin/python` instead of
    `spark.executorEnv.PYSPARK_PYTHON=py3.zip/py3/bin/python`, and the
    application then failed with an IOException.
    
    This PR aims to switch the Python executable on the executor side when the user specifies it.
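
The intended precedence can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the code in this pull request: the function name `resolve_executor_python`, its dictionary arguments, and the `"python"` fallback are all hypothetical and chosen for the example.

```python
def resolve_executor_python(driver_env, spark_conf, default="python"):
    """Return the Python executable an executor should launch.

    Hypothetical helper (not part of Spark): an explicit
    spark.executorEnv.PYSPARK_PYTHON setting takes precedence over the
    PYSPARK_PYTHON value inherited from the driver's environment.
    """
    # Executor-specific setting supplied by the user via --conf.
    executor_python = spark_conf.get("spark.executorEnv.PYSPARK_PYTHON")
    if executor_python:
        return executor_python
    # Otherwise fall back to the driver-side value, then to an assumed default.
    return driver_env.get("PYSPARK_PYTHON", default)


# Values taken from the spark-submit command above.
driver_env = {"PYSPARK_PYTHON": "~/anaconda3/envs/py3/bin/python"}
spark_conf = {"spark.executorEnv.PYSPARK_PYTHON": "py3.zip/py3/bin/python"}
chosen = resolve_executor_python(driver_env, spark_conf)
```

With the values from the command above, `chosen` is the path inside the shipped archive rather than the driver-local Anaconda path, which does not necessarily exist on the executor nodes.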
    
    ## How was this patch tested?
    
    Manually verified with the case above.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yaooqinn/spark SPARK-22640

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19840.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19840
    
----
commit 8ff5663fe9a32eae79c8ee6bc310409170a8da64
Author: Kent Yao <[email protected]>
Date:   2017-11-29T03:26:47Z

    switch python exec in executor side

----


