On a separate dev@spark thread, I raised the question of whether to
continue supporting Python 2 in Apache Spark going forward into Spark 3.0.

Python 2 is going EOL <https://github.com/python/devguide/pull/344> at the
end of 2019. The upcoming release of Spark 3.0 is an opportunity to make
breaking changes to Spark's APIs, so it is a good time to reconsider
Python 2 support in PySpark.

Key advantages to dropping Python 2 are:

   - Supporting PySpark becomes significantly easier (a sketch of an
   import-time guard follows the note below).
   - We avoid having to support Python 2 until Spark 4.0, which would
   likely mean supporting Python 2 well after it goes EOL.

(Note that supporting Python 2 after EOL means, among other things, that
PySpark would be supporting a version of Python that no longer receives
security patches.)
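
As a concrete illustration of the first point above, a Python-3-only
PySpark could simply reject Python 2 interpreters at import time. The
check below is a hypothetical sketch, not actual PySpark code:

    import sys

    # Hypothetical guard a Python-3-only PySpark could run at import time.
    if sys.version_info < (3, 0):
        raise RuntimeError(
            "This version of PySpark requires Python 3; "
            "Python 2 reaches end of life at the end of 2019."
        )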

The main disadvantage is that PySpark users with legacy Python 2 code
would have to migrate that code to Python 3 to take advantage of Spark
3.0.
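
Below is a minimal sketch of the kind of changes such a migration
involves; the job, column names, and data here are invented purely for
illustration:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("py3-migration-example").getOrCreate()
    df = spark.createDataFrame([(1, 10), (2, 25)], ["id", "value"])

    # Python 2: print is a statement, and / on ints is floor division:
    #     print df.count()
    #     avg = total / count
    # Python 3: print is a function, and / is true division (use // to floor).
    print(df.count())

    rows = df.collect()
    total = sum(r["value"] for r in rows)
    avg = total / len(rows)  # true division in Python 3
    print("average:", avg)

    spark.stop()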

This decision obviously has large implications for the Apache Spark
community, so we want to solicit community feedback.
