cocoatomo created SPARK-3706:
--------------------------------

             Summary: Cannot run IPython REPL with IPYTHON set to "1" and 
PYSPARK_PYTHON unset
                 Key: SPARK-3706
                 URL: https://issues.apache.org/jira/browse/SPARK-3706
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.1.0
         Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0
            Reporter: cocoatomo


h3. Problem

The section "Using the shell" in the Spark Programming Guide 
(https://spark.apache.org/docs/latest/programming-guide.html#using-the-shell) 
says that we can run the pyspark REPL through IPython.
However, the following command runs the default Python executable instead of IPython.

{quote}
$ IPYTHON=1 ./bin/pyspark
Python 2.7.8 (default, Jul  2 2014, 10:14:46) 
...
{quote}

The spark/bin/pyspark script at commit 
b235e013638685758885842dc3268e9800af3678 decides which executable and options 
to use in the following way.

# if PYSPARK_PYTHON unset
#* → defaulting to "python"
# if IPYTHON_OPTS set
#* → set IPYTHON "1"
# if Python scripts are passed to ./bin/pyspark → run them with ./bin/spark-submit
#* out of this issue's scope
# if IPYTHON set as "1"
#* → execute $PYSPARK_PYTHON (default: ipython) with arguments $IPYTHON_OPTS
#* otherwise execute $PYSPARK_PYTHON

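Therefore, when PYSPARK_PYTHON is unset, python is executed even though IPYTHON is 
"1": the first step has already set PYSPARK_PYTHON to "python", so the "ipython" 
default in the last step never applies.
In other words, when PYSPARK_PYTHON is unset, IPYTHON_OPTS and IPYTHON have no 
effect on deciding which command to use.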
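
The decision order listed above can be modeled with a small shell function. This is only a sketch of the steps as described in this report, not the actual contents of bin/pyspark: the function name resolve_current is hypothetical, an empty argument stands in for an unset variable, and the spark-submit branch is left out.

```shell
# Hypothetical model of the decision order described in this report.
# Arguments: PYSPARK_PYTHON, IPYTHON_OPTS, IPYTHON (empty string = unset).
# The body runs in a subshell so the caller's environment is not modified.
resolve_current() (
  PYSPARK_PYTHON="${1-}"
  IPYTHON_OPTS="${2-}"
  IPYTHON="${3-}"

  # PYSPARK_PYTHON gets its "python" default before IPYTHON is consulted.
  if [ -z "$PYSPARK_PYTHON" ]; then
    PYSPARK_PYTHON="python"
  fi

  # Setting IPYTHON_OPTS implies IPYTHON=1.
  if [ -n "$IPYTHON_OPTS" ]; then
    IPYTHON=1
  fi

  if [ "$IPYTHON" = "1" ]; then
    # The "ipython" fallback is dead code here: PYSPARK_PYTHON is never empty.
    echo "${PYSPARK_PYTHON:-ipython}${IPYTHON_OPTS:+ $IPYTHON_OPTS}"
  else
    echo "$PYSPARK_PYTHON"
  fi
)

resolve_current "" "" "1"   # prints: python
```

Printing the command instead of executing it keeps the sketch safe to run on a machine without IPython installed.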

||PYSPARK_PYTHON||IPYTHON_OPTS||IPYTHON||resulting command||expected command||
|(unset → defaults to python)|(unset)|(unset)|python|(same)|
|(unset → defaults to python)|(unset)|1|python|ipython|
|(unset → defaults to python)|an_option|(unset → set to 1)|python an_option|ipython an_option|
|(unset → defaults to python)|an_option|1|python an_option|ipython an_option|
|ipython|(unset)|(unset)|ipython|(same)|
|ipython|(unset)|1|ipython|(same)|
|ipython|an_option|(unset → set to 1)|ipython an_option|(same)|
|ipython|an_option|1|ipython an_option|(same)|


h3. Suggestion

The pyspark script should first determine whether the user wants to run IPython 
or another executable.

# if IPYTHON_OPTS set
#* set IPYTHON "1"
# if IPYTHON has a value "1"
#* PYSPARK_PYTHON defaults to "ipython" if not set
# PYSPARK_PYTHON defaults to "python" if not set
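
A minimal sketch of this suggested ordering, under the same kind of assumptions: the function name resolve_suggested is hypothetical, an empty argument stands in for an unset variable, and the command is printed rather than executed. It is an illustration of the three steps above, not the change made in the pull request.

```shell
# Hypothetical model of the suggested decision order.
# Arguments: PYSPARK_PYTHON, IPYTHON_OPTS, IPYTHON (empty string = unset).
# The body runs in a subshell so the caller's environment is not modified.
resolve_suggested() (
  PYSPARK_PYTHON="${1-}"
  IPYTHON_OPTS="${2-}"
  IPYTHON="${3-}"

  # Step 1: setting IPYTHON_OPTS implies IPYTHON=1.
  if [ -n "$IPYTHON_OPTS" ]; then
    IPYTHON=1
  fi

  if [ "$IPYTHON" = "1" ]; then
    # Step 2: with IPython requested, an unset PYSPARK_PYTHON means "ipython".
    PYSPARK_PYTHON="${PYSPARK_PYTHON:-ipython}"
    echo "$PYSPARK_PYTHON${IPYTHON_OPTS:+ $IPYTHON_OPTS}"
  else
    # Step 3: otherwise an unset PYSPARK_PYTHON means "python".
    echo "${PYSPARK_PYTHON:-python}"
  fi
)

resolve_suggested "" "" "1"           # prints: ipython
resolve_suggested "" "an_option" ""   # prints: ipython an_option
```

Because the IPYTHON check happens before any default is assigned, an explicitly set PYSPARK_PYTHON still takes precedence in every case.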

See the pull request for the detailed changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

