GitHub user cocoatomo opened a pull request: https://github.com/apache/spark/pull/2554
[SPARK-3706][PySpark] Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset

### Problem

The section "Using the shell" in the Spark Programming Guide (https://spark.apache.org/docs/latest/programming-guide.html#using-the-shell) says that we can run the pyspark REPL through IPython. But the following command does not run IPython; it runs the default Python executable instead.

{quote}
$ IPYTHON=1 ./bin/pyspark
Python 2.7.8 (default, Jul 2 2014, 10:14:46)
...
{quote}

The spark/bin/pyspark script at commit b235e013638685758885842dc3268e9800af3678 decides which executable and options to use in the following way.

1. if PYSPARK_PYTHON is unset
   * → default it to "python"
2. if IPYTHON_OPTS is set
   * → set IPYTHON to "1"
3. if some Python scripts are passed to ./bin/pyspark → run them with ./bin/spark-submit
   * out of this issue's scope
4. if IPYTHON is set to "1"
   * → execute $PYSPARK_PYTHON (default: ipython) with arguments $IPYTHON_OPTS
   * otherwise execute $PYSPARK_PYTHON

Because step 1 has already defaulted PYSPARK_PYTHON to "python", the "ipython" default in step 4 never applies. Therefore, when PYSPARK_PYTHON is unset, python is executed even though IPYTHON is "1". In other words, when PYSPARK_PYTHON is unset, IPYTHON_OPTS and IPYTHON have no effect on deciding which command to use.

PYSPARK_PYTHON | IPYTHON_OPTS | IPYTHON | resulting command | expected command
---- | ---- | ----- | ----- | -----
(unset → defaults to python) | (unset) | (unset) | python | (same)
(unset → defaults to python) | (unset) | 1 | python | ipython
(unset → defaults to python) | an_option | (unset → set to 1) | python an_option | ipython an_option
(unset → defaults to python) | an_option | 1 | python an_option | ipython an_option
ipython | (unset) | (unset) | ipython | (same)
ipython | (unset) | 1 | ipython | (same)
ipython | an_option | (unset → set to 1) | ipython an_option | (same)
ipython | an_option | 1 | ipython an_option | (same)

### Suggestion

The pyspark script should first determine whether the user wants to run IPython or another executable.

1. if IPYTHON_OPTS is set
   * set IPYTHON to "1"
2. if IPYTHON has the value "1"
   * PYSPARK_PYTHON defaults to "ipython" if not set
3. PYSPARK_PYTHON defaults to "python" if not set

See the pull request for the detailed modification.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cocoatomo/spark issues/cannot-run-ipython-without-options

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2554.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2554

----
commit 10d56fbc2703e919882610cc061b00481d009b88
Author: cocoatomo <cocoatom...@gmail.com>
Date:   2014-09-27T03:41:26Z

    [SPARK-3706][PySpark] Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset

----
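
The resolution order proposed under "Suggestion" can be sketched in shell as follows. This is only an illustration, not the code in the pull request: `resolve_pyspark_command` is a hypothetical helper that takes the three settings as arguments (an empty string stands for "unset") and prints the command that would be executed, so that the rows of the table above can be checked by hand.

```shell
#!/usr/bin/env bash
# Illustrative sketch of the proposed resolution order (not the real bin/pyspark).
# resolve_pyspark_command is a hypothetical helper; arguments are:
#   $1 = PYSPARK_PYTHON, $2 = IPYTHON_OPTS, $3 = IPYTHON  ("" means unset)
resolve_pyspark_command() {
  local pyspark_python="$1" ipython_opts="$2" ipython="$3"

  # 1. Setting IPYTHON_OPTS implies IPYTHON=1.
  if [ -n "$ipython_opts" ]; then
    ipython=1
  fi

  # 2. With IPYTHON=1, an unset PYSPARK_PYTHON defaults to "ipython".
  if [ "$ipython" = "1" ] && [ -z "$pyspark_python" ]; then
    pyspark_python="ipython"
  fi

  # 3. In every other case an unset PYSPARK_PYTHON defaults to "python".
  if [ -z "$pyspark_python" ]; then
    pyspark_python="python"
  fi

  # Print the resulting command line, appending the options when present.
  if [ -n "$ipython_opts" ]; then
    echo "$pyspark_python $ipython_opts"
  else
    echo "$pyspark_python"
  fi
}

resolve_pyspark_command "" "" "1"           # prints "ipython"
resolve_pyspark_command "" "an_option" ""   # prints "ipython an_option"
resolve_pyspark_command "" "" ""            # prints "python"
```

The point of the ordering is that the "python" default is applied last, after the IPython check, so it can no longer mask the "ipython" default as it does in the current script.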