GitHub user yaooqinn opened a pull request:
https://github.com/apache/spark/pull/19840
[SPARK-22640][PYSPARK][YARN]switch python exec in executor side
## What changes were proposed in this pull request?
```
PYSPARK_PYTHON=~/anaconda3/envs/py3/bin/python \
bin/spark-submit --master yarn --deploy-mode client \
--archives ~/anaconda3/envs/py3.zip \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=py3.zip/py3/bin/python \
--conf spark.executorEnv.PYSPARK_PYTHON=py3.zip/py3/bin/python \
/home/hadoop/data/apache-spark/spark-2.1.2-bin-hadoop2.7/examples/src/main/python/mllib/correlations_example.py
```
In the case above, I created a Python environment, shipped it via
`--archives`, and then pointed the executors at it via
`spark.executorEnv.PYSPARK_PYTHON`.
However, the executors appeared to use `PYSPARK_PYTHON=~/anaconda3/envs/py3/bin/python`
instead of `spark.executorEnv.PYSPARK_PYTHON=py3.zip/py3/bin/python`, and the
application then failed with an IOException.
This PR aims to switch the Python executable on the executor side when the user specifies one.
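
To illustrate the intended precedence, here is a minimal sketch in Python. It is not the code in this patch: `resolve_python_exec` and its parameters are hypothetical names used only for illustration, and it assumes the executor can see both the path sent from the submitting side and its own environment.

```python
import os
from typing import Mapping, Optional


def resolve_python_exec(driver_python_exec: str,
                        executor_env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the Python executable an executor should launch.

    Hypothetical helper: prefer PYSPARK_PYTHON from the executor's own
    environment (e.g. set via spark.executorEnv.PYSPARK_PYTHON) and fall
    back to the path that came from the submitting side.
    """
    if executor_env is None:
        executor_env = os.environ
    # Treat an empty or whitespace-only value as "not specified".
    override = executor_env.get("PYSPARK_PYTHON", "").strip()
    return override if override else driver_python_exec


driver_side = "~/anaconda3/envs/py3/bin/python"

# Executor-side setting present: it wins over the driver-side path.
print(resolve_python_exec(driver_side, {"PYSPARK_PYTHON": "py3.zip/py3/bin/python"}))

# No executor-side setting: fall back to the driver-side path.
print(resolve_python_exec(driver_side, {}))
```

With this precedence, the relative path `py3.zip/py3/bin/python` from the shipped archive would be used on the executor nodes, where `~/anaconda3/envs/py3/bin/python` may not exist.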
## How was this patch tested?
Manually verified with the case above.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/yaooqinn/spark SPARK-22640
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19840.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19840
----
commit 8ff5663fe9a32eae79c8ee6bc310409170a8da64
Author: Kent Yao <[email protected]>
Date: 2017-11-29T03:26:47Z
switch python exec in executor side
----