[
https://issues.apache.org/jira/browse/SPARK-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-1134:
---------------------------------
Affects Version/s: 0.9.1
> ipython won't run standalone python script
> ------------------------------------------
>
> Key: SPARK-1134
> URL: https://issues.apache.org/jira/browse/SPARK-1134
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 0.9.0, 0.9.1
> Reporter: Diana Carroll
> Assignee: Diana Carroll
> Labels: pyspark
>
> Environment: Spark 0.9.0, Python 2.6.6, and IPython 1.1.0.
> The problem: to run a Python script as a standalone app, the docs say to
> execute the command "pyspark myscript.py". This works as long as IPYTHON=0,
> but with IPYTHON=1 it doesn't work.
> This problem arose for me because I tried to save myself typing by setting
> IPYTHON=1 in my shell profile script, which then meant I was unable to
> execute pyspark standalone scripts.
> My analysis:
> In the pyspark script, command-line arguments are simply ignored if IPython
> is used:
> {code}
> if [[ "$IPYTHON" = "1" ]] ; then
>   exec ipython $IPYTHON_OPTS
> else
>   exec "$PYSPARK_PYTHON" "$@"
> fi
> {code}
> I thought I could get around this by changing the script to pass "$@" to
> ipython as well. However, this doesn't work: doing so results in an error
> saying multiple Spark contexts can't be run at once.
> This is because of a feature (or bug) of IPython related to the PYTHONSTARTUP
> environment variable. The pyspark script sets this variable to point to the
> python/shell.py script, which initializes the Spark context. Regular
> Python runs the PYTHONSTARTUP script ONLY when it is invoked in
> interactive mode; when run with a script, it ignores the variable. IPython
> runs that script every time, regardless, which means it always executes
> Spark's shell.py script to initialize the Spark context, even when it was
> invoked with a script.
> Proposed solution:
> Short term: add this information to the Spark docs regarding IPython.
> Something like "Note: IPython can only be used interactively. Use regular
> Python to execute pyspark script files."
> Long term: change the pyspark script to check whether arguments are passed
> in; if so, just call python instead of pyspark, or don't set the
> PYTHONSTARTUP variable? Or maybe fix shell.py to detect when it is being
> invoked non-interactively and not initialize sc.
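
The first long-term option above (check whether arguments were passed and, if so, call plain python) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual change to the pyspark script: choose_launcher is a hypothetical helper name, it prints the command instead of exec-ing it so the decision can be inspected, and the fallback to "python" when PYSPARK_PYTHON is unset is also an assumption.

```shell
# Hypothetical helper (not part of Spark): prints the command the launcher
# would run, rather than exec-ing it, so the branching is easy to check.
choose_launcher() {
  if [ "$IPYTHON" = "1" ] && [ "$#" -eq 0 ]; then
    # No arguments: interactive session, so IPython can be used.
    # IPYTHON_OPTS is left unquoted on purpose, as in the original script.
    echo ipython $IPYTHON_OPTS
  else
    # Arguments given (e.g. a script path): use regular Python, which
    # ignores PYTHONSTARTUP when it is run with a script.
    echo "${PYSPARK_PYTHON:-python}" "$@"
  fi
}
```

With IPYTHON=1, calling choose_launcher with no arguments selects ipython, while choose_launcher myscript.py selects the regular interpreter. In a real launcher the echo calls would presumably be replaced by exec.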