Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2651#issuecomment-58429854
  
    We don't support IPython 1.0 anymore, so it seems reasonable to make Python 
2.7 the default when using IPython (since IPython 2.0 requires at least Python 
2.7).  It seems like there's a growing list of use-cases that we'd like to 
support:
    
    1. The same custom Python version (non-IPython) used everywhere (e.g. I 
want to use PyPy).
    2. Stock IPython (`ipython`) with a different Python version on the workers.
    3. Custom IPython (#2167) with a different Python version on the workers.
    4. IPython (custom or stock) everywhere, including the workers.
    
    In Spark 1.1, we support 1 and 2 (via `IPYTHON=1`), but not 3 or 4.  In 
#2167, we tried to add support for 3 (Custom IPython) but ended up actually 
providing 4 (which was broken until #2554 and this PR).  
    
    For 3, we need a way to specify the driver's Python executable 
independently from the worker's executable.  Currently, `PYSPARK_PYTHON` 
affects all Python processes.  What if we added a `PYSPARK_DRIVER_PYTHON` 
option that only affected the driver Python (and which defaults to 
`PYSPARK_PYTHON` if unset)?  I think this would provide enough mechanism to 
support all four use-cases.
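
    A minimal sketch of the proposed lookup, just to make the fallback order 
concrete.  The helper name `resolve_python_executables` is hypothetical, and 
the final fallback to plain `python` is an assumption about the default:

```python
import os


def resolve_python_executables(env=None):
    """Return (driver_python, worker_python) for the given environment.

    Hypothetical helper: it only illustrates the proposed precedence and is
    not taken from the Spark launch scripts.
    """
    env = os.environ if env is None else env
    # Workers keep using PYSPARK_PYTHON; falling back to plain `python`
    # here is an assumed default.
    worker = env.get("PYSPARK_PYTHON") or "python"
    # The driver prefers PYSPARK_DRIVER_PYTHON and otherwise uses the
    # same executable as the workers.
    driver = env.get("PYSPARK_DRIVER_PYTHON") or worker
    return driver, worker
```

    With this rule, use-case 3 would be `PYSPARK_DRIVER_PYTHON` set to the 
custom IPython plus `PYSPARK_PYTHON` set to the workers' Python, while 
use-case 1 only needs `PYSPARK_PYTHON`.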
    
    As far as defaults are concerned, maybe we could try `which python2.7` to 
check whether Python 2.7 is installed, and fall back to `python` if it's not 
available (this would only be used if `PYSPARK_PYTHON` wasn't set).  I've noticed 
that certain programs run _way_ faster under 2.7 (programs that use `json`, for 
example), so we should probably try to make `python2.7` the default if it's 
installed.
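
    Roughly, the default selection could look like the sketch below.  It is 
written in Python only for illustration (the real check would presumably 
live in the launch script), and `default_python` is a hypothetical name:

```python
import shutil


def default_python(env, which=shutil.which):
    """Pick the Python executable to use when nothing better is specified.

    Hypothetical helper illustrating the proposed default; `which` is
    injectable so the lookup can be exercised without touching PATH.
    """
    explicit = env.get("PYSPARK_PYTHON")
    if explicit:
        # An explicit setting always wins over the probed default.
        return explicit
    # Equivalent of `which python2.7`: prefer 2.7 when it is on PATH,
    # otherwise fall back to whatever `python` resolves to.
    return "python2.7" if which("python2.7") else "python"
```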
    
    Does this sound like a reasonable approach?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
