Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/19840
I didn't read the entire thread here but what you want is this:
--archives hdfs:///python36/python36.tgz#python36 --conf
spark.pyspark.python=./python36/bin/python3.6 --conf
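The `#python36` fragment in `--archives` is what makes the relative path in `spark.pyspark.python` work: YARN's distributed cache unpacks the archive under a link directory named by the part after `#`. A minimal sketch of how that alias is interpreted (the helper function is my own illustration, not Spark code):

```python
# Sketch of the "#alias" suffix on --archives: the part after "#" names the
# directory the archive is unpacked under on each node, which is why
# spark.pyspark.python can point at ./python36/bin/python3.6.
def archive_link_name(uri: str) -> str:
    path, sep, fragment = uri.partition("#")
    if sep and fragment:
        return fragment              # explicit alias after "#"
    return path.rsplit("/", 1)[-1]   # default: the archive's file name

print(archive_link_name("hdfs:///python36/python36.tgz#python36"))  # python36
print(archive_link_name("hdfs:///python36/python36.tgz"))           # python36.tgz
```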
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19840
Build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands,
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
@vanzin I am not very familiar with the Python part
([context.py#L191](https://github.com/yaooqinn/spark/blob/8ff5663fe9a32eae79c8ee6bc310409170a8da64/python/pyspark/context.py#L191)),
so I handle it at
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19840
@yaooqinn do you plan to update this PR?
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19840
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19840
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1718/
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19840
So, any updates here?
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19840
That's what I said in my comment ("except for the driver python config").
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
@vanzin
according to @ueshin's explanation, `PYSPARK_DRIVER_PYTHON` is only for the
driver; if the executor follows the order of
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19840
I'm trying to understand what
https://github.com/apache/spark/blob/master/python/pyspark/context.py#L191
is really achieving. It seems pretty broken to me and feels like the whole
`pythonExec`
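For context, in the Spark 2.2 code under discussion the line in question boils down to an environment lookup with a bare fallback, roughly (a paraphrase, not the exact source):

```python
import os

# Rough paraphrase of what pyspark/context.py#L191 does in the version being
# discussed: the pythonExec sent to executors is taken from the driver
# process's PYSPARK_PYTHON environment variable, falling back to "python".
# Nothing here consults spark.pyspark.python or spark.executorEnv.*, which is
# the gap this thread is about.
def resolve_python_exec(environ=os.environ):
    return environ.get("PYSPARK_PYTHON", "python")

print(resolve_python_exec({}))                               # python
print(resolve_python_exec({"PYSPARK_PYTHON": "python3.6"}))  # python3.6
```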
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19840
@yaooqinn It is used for executors.
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
@ueshin
is
[context.py#L191](https://github.com/yaooqinn/spark/blob/8ff5663fe9a32eae79c8ee6bc310409170a8da64/python/pyspark/context.py#L191)
set for both the driver and the executor?
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
@ueshin i see.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19840
@yaooqinn I meant it is not used for `pythonExec`.
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
I can see `spark.executorEnv.PYSPARK_PYTHON` in `sparkConf` on the executor side,
because it is set at
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19840
@yaooqinn OK, I see the situation.
In client mode, I think we can't use `spark.yarn.appMasterEnv.XXX`, which is
for cluster mode only. So we should use the environment variable `PYSPARK_PYTHON` or
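The client/cluster distinction above can be sketched as a small decision rule (my own illustration of the point, not Spark code): in cluster mode the driver runs inside the YARN AM, so `spark.yarn.appMasterEnv.*` can set its environment; in client mode the driver inherits the submitting shell's environment instead.

```python
# Illustrative only: which mechanism can set the *driver's* Python
# interpreter on YARN, per the rule discussed above.
def driver_python_source(deploy_mode: str) -> str:
    if deploy_mode == "cluster":
        # The AM hosts the driver, so appMasterEnv reaches it.
        return "spark.yarn.appMasterEnv.PYSPARK_PYTHON"
    # In client mode the driver runs in the local shell, so only the
    # local environment variables reach it.
    return "local PYSPARK_DRIVER_PYTHON / PYSPARK_PYTHON environment"

print(driver_python_source("cluster"))
print(driver_python_source("client"))
```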
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
@ueshin case 8 should be client deploy mode; excuse me for the copy-paste
mistake, fixed
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19840
@yaooqinn What's the difference between case 7 and case 8? It looks like the
same configuration but a different result?
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
Using spark-2.2.0-bin-hadoop2.7 with numpy, running
examples/src/main/python/mllib/correlations_example.py:
### case 1
|key|value|
|---|---|
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
@vanzin PYSPARK_DRIVER_PYTHON won't work because
[context.py#L191](https://github.com/yaooqinn/spark/blob/8ff5663fe9a32eae79c8ee6bc310409170a8da64/python/pyspark/context.py#L191)
doesn't deal with
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19840
Instead of setting `PYSPARK_PYTHON=~/anaconda3/envs/py3/bin/python`, what
happens if you set `PYSPARK_DRIVER_PYTHON=~/anaconda3/envs/py3/bin/python`?
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/19840
I'm a little concerned about such changes; a misconfiguration may introduce a
discrepancy between the driver Python and the executor Python. At least
we should honor this configuration
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
Yes, you are right: we should use the same Python executables. But **same**
might mean binary-identical, not just the same path
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/19840
Oh, I see. You're running in client mode. So this one `--conf
spark.yarn.appMasterEnv.PYSPARK_PYTHON=py3.zip/py3/bin/python` is useless.
So I guess the behavior is expected. Because
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
[screenshot](https://user-images.githubusercontent.com/8326978/33471349-e570953e-d6a7-11e7-9fec-74963efe37d2.png)
@jerryshao the env variables are set correctly by YARN, but the `pythonExec` is generated
in
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/19840
I think in YARN we have several different ways to set `PYSPARK_PYTHON`; I
guess your issue is which one should take priority?
Can you please:
1. Define a consistent ordering
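One plausible consistent ordering, matching the conf-over-env resolution Spark's launcher uses for the driver (`spark.pyspark.driver.python`, then `spark.pyspark.python`, then `PYSPARK_DRIVER_PYTHON`, then `PYSPARK_PYTHON`, then plain `python`). This is my paraphrase of that ordering, not a quote of the source:

```python
# Sketch of a "first non-empty wins" resolution. The property and variable
# names are real Spark confs/envs; the helper itself is illustrative.
def first_non_empty(*candidates):
    for c in candidates:
        if c:
            return c
    return None

def resolve_driver_python(conf: dict, env: dict) -> str:
    return first_non_empty(
        conf.get("spark.pyspark.driver.python"),
        conf.get("spark.pyspark.python"),
        env.get("PYSPARK_DRIVER_PYTHON"),
        env.get("PYSPARK_PYTHON"),
        "python",
    )

print(resolve_driver_python({}, {}))  # python
# The conf wins over the environment variable:
print(resolve_driver_python({"spark.pyspark.python": "./python36/bin/python3.6"},
                            {"PYSPARK_PYTHON": "/usr/bin/python2.7"}))
```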
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
I guess specifying `PYSPARK_PYTHON=~/anaconda3/envs/py3/bin/python`
overwrites `spark.executorEnv.PYSPARK_PYTHON` by
Github user yaooqinn commented on the issue:
https://github.com/apache/spark/pull/19840
@ueshin cluster mode is working, client mode is not
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19840
Should we set the `pythonExec` during the initialization of `SparkContext`
at
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/19840
cc @cloud-fan @ueshin
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19840
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84280/
Test FAILed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19840
Merged build finished. Test FAILed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19840
**[Test build #84280 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84280/testReport)**
for PR 19840 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19840
**[Test build #84280 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84280/testReport)**
for PR 19840 at commit