PySpark works with CPython by default (your Python code is executed natively by CPython worker processes, not converted into Java bytecode), and you can specify which version of Python to use with:

    PYSPARK_PYTHON=path/to/python bin/spark-submit xxx.py

When you do the upgrade, you could install Python 2.7 on every machine in the cluster and test it with:

    PYSPARK_PYTHON=python2.7 bin/spark-submit xxx.py

For YARN, you also need to install Python 2.7 on every node in the cluster.
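As a quick sanity check after installing 2.7 everywhere, something along these lines (a rough sketch, assuming an existing SparkContext named sc) will report which interpreter the executors actually pick up:

    import sys

    # Ask a batch of tasks which Python the executors are running;
    # after the upgrade every entry should start with "2.7".
    versions = sc.parallelize(range(100)) \
                 .map(lambda _: sys.version) \
                 .distinct() \
                 .collect()
    print(versions)

Spreading the tasks over enough partitions gives reasonable coverage of the executors, though it is not a strict guarantee that every node was exercised.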
On Tue, May 19, 2015 at 7:44 AM, YaoPau <[email protected]> wrote:
> We're running Python 2.6.6 here but we're looking to upgrade to 2.7.x in a
> month.
>
> Does pyspark work by converting Python into Java Bytecode, or does it run
> Python natively?
>
> And along those lines, if we're running in yarn-client mode, would we have
> to upgrade just the edge node version of Python, or every node in the
> cluster?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Does-Python-2-7-have-to-be-installed-on-every-cluster-node-tp22945.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
