[ 
https://issues.apache.org/jira/browse/SPARK-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156776#comment-14156776
 ] 

cocoatomo commented on SPARK-3706:
----------------------------------

Thank you for the comment and modification, [~joshrosen].

At a quick glance, this regression was introduced by commit 
[f38fab97c7970168f1bd81d4dc202e36322c95e3|https://github.com/apache/spark/commit/f38fab97c7970168f1bd81d4dc202e36322c95e3#diff-5dbcb82caf8131d60c73e82cf8d12d8aR107]
 on the master branch.
Demoting "ipython" to a mere default value forces us to set PYSPARK_PYTHON to 
"ipython" explicitly, since PYSPARK_PYTHON already defaults to "python" at the 
top of the ./bin/pyspark script.
This issue is a regression between 1.1.0 and 1.2.0, and therefore affects only 
1.2.0.
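
To make the ordering problem concrete, here is a minimal sketch of the regressed logic. It is not the actual ./bin/pyspark source: the function name resolve_regressed and its three positional arguments (standing in for PYSPARK_PYTHON, IPYTHON_OPTS and IPYTHON) are hypothetical, and it only prints the command that would be chosen.

```shell
# Hypothetical reduction of the regressed ordering; NOT the real ./bin/pyspark.
# Usage: resolve_regressed PYSPARK_PYTHON IPYTHON_OPTS IPYTHON
# The function body runs in a subshell, so its variables stay local.
resolve_regressed() (
  pyspark_python="$1"
  ipython_opts="$2"
  ipython="$3"

  # Top of the script: the interpreter defaults to "python" when unset.
  pyspark_python="${pyspark_python:-python}"

  # Setting IPYTHON_OPTS implies IPYTHON=1.
  if [ -n "$ipython_opts" ]; then
    ipython=1
  fi

  if [ "$ipython" = "1" ]; then
    # The "ipython" fallback can never apply here, because the
    # variable was already filled with "python" above.
    echo "${pyspark_python:-ipython}${ipython_opts:+ $ipython_opts}"
  else
    echo "$pyspark_python"
  fi
)
```

With PYSPARK_PYTHON empty and IPYTHON set to "1", `resolve_regressed "" "" 1` prints `python`, which mirrors the behavior reported in this issue.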

> Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset
> ------------------------------------------------------------------------
>
>                 Key: SPARK-3706
>                 URL: https://issues.apache.org/jira/browse/SPARK-3706
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.2.0
>         Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0
>            Reporter: cocoatomo
>              Labels: pyspark
>
> h3. Problem
> The section "Using the shell" in Spark Programming Guide 
> (https://spark.apache.org/docs/latest/programming-guide.html#using-the-shell) 
> says that we can run pyspark REPL through IPython.
> However, the following command runs the default Python executable instead of IPython.
> {quote}
> $ IPYTHON=1 ./bin/pyspark
> Python 2.7.8 (default, Jul  2 2014, 10:14:46) 
> ...
> {quote}
> The spark/bin/pyspark script at commit 
> b235e013638685758885842dc3268e9800af3678 decides which executable and options 
> to use in the following way.
> # if PYSPARK_PYTHON unset
> #* → defaulting to "python"
> # if IPYTHON_OPTS set
> #* → set IPYTHON "1"
> # some python scripts passed to ./bin/pyspark → run them with ./bin/spark-submit
> #* out of this issue's scope
> # if IPYTHON set as "1"
> #* → execute $PYSPARK_PYTHON (default: ipython) with arguments $IPYTHON_OPTS
> #* otherwise execute $PYSPARK_PYTHON
> Therefore, when PYSPARK_PYTHON is unset, python is executed even though IPYTHON is 
> "1".
> In other words, when PYSPARK_PYTHON is unset, IPYTHON_OPTS and IPYTHON have no 
> effect on deciding which command to use.
> ||PYSPARK_PYTHON||IPYTHON_OPTS||IPYTHON||resulting command||expected command||
> |(unset → defaults to python)|(unset)|(unset)|python|(same)|
> |(unset → defaults to python)|(unset)|1|python|ipython|
> |(unset → defaults to python)|an_option|(unset → set to 1)|python an_option|ipython an_option|
> |(unset → defaults to python)|an_option|1|python an_option|ipython an_option|
> |ipython|(unset)|(unset)|ipython|(same)|
> |ipython|(unset)|1|ipython|(same)|
> |ipython|an_option|(unset → set to 1)|ipython an_option|(same)|
> |ipython|an_option|1|ipython an_option|(same)|
> h3. Suggestion
> The pyspark script should first determine whether a user wants to run 
> IPython or another executable.
> # if IPYTHON_OPTS set
> #* set IPYTHON "1"
> # if IPYTHON has a value "1"
> #* PYSPARK_PYTHON defaults to "ipython" if not set
> # PYSPARK_PYTHON defaults to "python" if not set
> See the pull request for the detailed changes.
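
The ordering proposed in the Suggestion section of the quoted description can be sketched as follows. This is only an illustration under stated assumptions, not the code from the pull request: the function name resolve_suggested and its positional arguments (standing in for PYSPARK_PYTHON, IPYTHON_OPTS and IPYTHON) are hypothetical, and the function prints the chosen command rather than executing it.

```shell
# Hypothetical sketch of the suggested ordering; NOT the code from the pull request.
# Usage: resolve_suggested PYSPARK_PYTHON IPYTHON_OPTS IPYTHON
# The function body runs in a subshell, so its variables stay local.
resolve_suggested() (
  pyspark_python="$1"
  ipython_opts="$2"
  ipython="$3"

  # 1. Setting IPYTHON_OPTS implies IPYTHON=1.
  if [ -n "$ipython_opts" ]; then
    ipython=1
  fi

  # 2. When IPython is requested, default the interpreter to "ipython".
  if [ "$ipython" = "1" ]; then
    pyspark_python="${pyspark_python:-ipython}"
  fi

  # 3. Otherwise fall back to plain "python".
  pyspark_python="${pyspark_python:-python}"

  if [ "$ipython" = "1" ]; then
    echo "$pyspark_python${ipython_opts:+ $ipython_opts}"
  else
    echo "$pyspark_python"
  fi
)
```

Because the "ipython" default is applied before the "python" default, an explicit PYSPARK_PYTHON still wins, while an unset one follows IPYTHON and IPYTHON_OPTS, which matches the "expected command" column of the table.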



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
