GitHub user cocoatomo opened a pull request:

    https://github.com/apache/spark/pull/2554

    [SPARK-3706][PySpark] Cannot run IPython REPL with IPYTHON set to "1" and 
PYSPARK_PYTHON unset

    ### Problem
    
    The section "Using the shell" in the Spark Programming Guide 
(https://spark.apache.org/docs/latest/programming-guide.html#using-the-shell) 
says that we can run the PySpark REPL through IPython.
    But the following command does not run IPython; it runs the default Python executable instead.
    
    {quote}
    $ IPYTHON=1 ./bin/pyspark
    Python 2.7.8 (default, Jul  2 2014, 10:14:46) 
    ...
    {quote}
    
    The spark/bin/pyspark script at commit 
b235e013638685758885842dc3268e9800af3678 decides which executable and options 
to use in the following way.
    
    1. if PYSPARK_PYTHON is unset
       * → default it to "python"
    2. if IPYTHON_OPTS is set
       * → set IPYTHON to "1"
    3. if a Python script is passed to ./bin/pyspark → run it with 
./bin/spark-submit
       * out of scope for this issue
    4. if IPYTHON is set to "1"
       * → execute $PYSPARK_PYTHON (default: ipython) with the arguments in 
$IPYTHON_OPTS
       * otherwise execute $PYSPARK_PYTHON
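
    The ordering above can be sketched as a shell function (a simplified illustration, not the actual script: function arguments stand in for the environment variables, step 3's script handling is omitted, and the chosen command line is echoed instead of exec'd):

    ```shell
    # Sketch of the current bin/pyspark decision order.
    choose_python() {
      PYSPARK_PYTHON="$1"; IPYTHON_OPTS="$2"; IPYTHON="$3"

      # step 1: the default is applied *before* IPYTHON is consulted
      if [ -z "$PYSPARK_PYTHON" ]; then
        PYSPARK_PYTHON="python"
      fi

      # step 2: IPYTHON_OPTS implies IPYTHON=1
      if [ -n "$IPYTHON_OPTS" ]; then
        IPYTHON=1
      fi

      # step 4: the "ipython" fallback below is dead code, because step 1
      # guarantees PYSPARK_PYTHON is non-empty by this point
      if [ "$IPYTHON" = "1" ]; then
        echo "${PYSPARK_PYTHON:-ipython}" $IPYTHON_OPTS
      else
        echo "$PYSPARK_PYTHON"
      fi
    }

    choose_python "" "" "1"   # prints "python", not the expected "ipython"
    ```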
    
    Therefore, when PYSPARK_PYTHON is unset, python is executed even though 
IPYTHON is "1".
    In other words, when PYSPARK_PYTHON is unset, IPYTHON_OPTS and IPYTHON have 
no effect on which command is used.
    
    PYSPARK_PYTHON | IPYTHON_OPTS | IPYTHON | resulting command | expected command
    ---- | ---- | ----- | ----- | -----
    (unset → defaults to python) | (unset) | (unset) | python | (same)
    (unset → defaults to python) | (unset) | 1 | python | ipython
    (unset → defaults to python) | an_option | (unset → set to 1) | python an_option | ipython an_option
    (unset → defaults to python) | an_option | 1 | python an_option | ipython an_option
    ipython | (unset) | (unset) | ipython | (same)
    ipython | (unset) | 1 | ipython | (same)
    ipython | an_option | (unset → set to 1) | ipython an_option | (same)
    ipython | an_option | 1 | ipython an_option | (same)
    
    
    ### Suggestion
    
    The pyspark script should first determine whether the user wants to run 
IPython or another executable.
    
    1. if IPYTHON_OPTS is set
       * set IPYTHON to "1"
    2. if IPYTHON is "1"
       * PYSPARK_PYTHON defaults to "ipython" if not set
    3. PYSPARK_PYTHON defaults to "python" if not set
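
    The proposed ordering can be sketched the same way (again a simplified illustration, with function arguments standing in for the environment variables and echo in place of exec): the interpreter default is chosen only after IPYTHON is resolved.

    ```shell
    # Sketch of the proposed decision order: resolve IPYTHON first,
    # then pick the interpreter default accordingly.
    choose_python() {
      PYSPARK_PYTHON="$1"; IPYTHON_OPTS="$2"; IPYTHON="$3"

      # step 1: IPYTHON_OPTS implies IPYTHON=1
      if [ -n "$IPYTHON_OPTS" ]; then
        IPYTHON=1
      fi

      # steps 2 and 3: the default now honors IPYTHON
      if [ "$IPYTHON" = "1" ]; then
        PYSPARK_PYTHON="${PYSPARK_PYTHON:-ipython}"
      else
        PYSPARK_PYTHON="${PYSPARK_PYTHON:-python}"
      fi

      echo "$PYSPARK_PYTHON" $IPYTHON_OPTS
    }

    choose_python "" "" "1"   # now resolves to "ipython"
    ```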
    
    See the pull request for more detailed modification.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cocoatomo/spark issues/cannot-run-ipython-without-options

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2554.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2554
    
----
commit 10d56fbc2703e919882610cc061b00481d009b88
Author: cocoatomo <cocoatom...@gmail.com>
Date:   2014-09-27T03:41:26Z

    [SPARK-3706][PySpark] Cannot run IPython REPL with IPYTHON set to "1" and 
PYSPARK_PYTHON unset

----

