I am fairly new to Python and am starting a new project that will make use of Spark and the Python machine learning libraries (matplotlib, pandas, etc.). I noticed that the spark-ec2 script set up my AWS cluster with Python 2.6 and 2.7.
http://spark.apache.org/docs/latest/programming-guide.html#linking-with-spark

"Spark 1.5.1 works with Python 2.6+ or Python 3.4+. It can use the standard CPython interpreter, so C libraries like NumPy can be used. It also works with PyPy 2.3+. PySpark works with IPython 1.0.0 and later."

I realize there are a lot of legacy Python packages that are probably vectorized and not easy to port. What would you recommend?

I assume that if I wanted to use Python 3 I would need to install it on all the workers and the master, and then follow the directions in linking-with-spark to make Spark use the correct version of Python. (Of course, I realize I would also need to install any third-party packages on all the workers.)

Kind regards,

Andy
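From reading the docs, I think selecting the interpreter comes down to setting an environment variable before launching PySpark. This is just a sketch of what I'd try, based on my understanding; it assumes python3 is already installed (and on the PATH) on the master and every worker:

```shell
# Tell PySpark which interpreter to use on the executors and the driver.
# (python3 must already be installed on every node in the cluster.)
export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=python3

# Then launch as usual, e.g.:
# ./bin/pyspark
```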