Re: PySpark: preference for Python 2.7 or Python 3.5?
On 9/2/16 3:47 AM, Felix Cheung wrote:
> There is an Anaconda parcel one could readily install on CDH
> https://docs.continuum.io/anaconda/cloudera
>
> As Sean says it is Python 2.7.x.
>
> Spark should work for both 2.7 and 3.5.

Yes, I'm actually an engineer at Continuum, so I know the Anaconda parcel pretty well. It is more a question of whether CDH and Spark "work" better with PySpark on Python 2.7 or Python 3.5. My sense was "you choose: both are fine", but I wanted to ask here before committing to going down one path or another.

Thanks,
Ian
PySpark: preference for Python 2.7 or Python 3.5?
I have the option of running PySpark with Python 2.7 or Python 3.5. I am fairly expert with Python and know the Python-side history of the differences. All else being the same, I have a preference for Python 3.5. I'm using CDH 5.8 and I'm wondering if that biases whether I should proceed with PySpark on top of Python 2.7 or 3.5.

Opinions? Does Cloudera have an official (or unofficial) position on this?

Thanks,
Ian

___
Ian Stokes-Rees
Computational Scientist
Continuum Analytics <http://continuum.io>
@ijstokes Twitter <http://twitter.com/ijstokes> LinkedIn <http://linkedin.com/in/ijstokes> Github <http://github.com/ijstokes>
617.942.0218