I think we can deprecate it in 3.x.0 and remove it in Spark 4.0.0. Many people still use Python 2. Also, technically 2.7 support is not officially dropped yet - https://pythonclock.org/
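(As an aside, the deprecation itself could be as lightweight as a warning emitted when PySpark starts under Python 2. A rough sketch below; the function name and message wording are placeholders, not anything currently in PySpark:)

```python
import sys
import warnings


def warn_if_python2():
    """Emit a DeprecationWarning when running under Python 2.

    Hypothetical sketch of an import-time check; the hook point
    and message text are assumptions, not PySpark's actual code.
    """
    if sys.version_info[0] < 3:
        warnings.warn(
            "Python 2 support is deprecated and will be removed "
            "in a future release; please migrate to Python 3.",
            DeprecationWarning,
            stacklevel=2,
        )


if __name__ == "__main__":
    warn_if_python2()
```

A check like this costs nothing on Python 3 and gives Python 2 users at least one release cycle of advance notice before removal.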
On Mon, Sep 17, 2018 at 9:31 AM, Aakash Basu <aakash.spark....@gmail.com> wrote:

> Removing support for an API in a major release makes poor sense;
> deprecating is always better. Removal can always be done two or three
> minor releases later.
>
> On Mon 17 Sep, 2018, 6:49 AM Felix Cheung, <felixcheun...@hotmail.com>
> wrote:
>
>> I don't think we should remove any API even in a major release without
>> deprecating it first...
>>
>> ------------------------------
>> *From:* Mark Hamstra <m...@clearstorydata.com>
>> *Sent:* Sunday, September 16, 2018 12:26 PM
>> *To:* Erik Erlandson
>> *Cc:* u...@spark.apache.org; dev
>> *Subject:* Re: Should python-2 be supported in Spark 3.0?
>>
>> We could also deprecate Py2 already in the 2.4.0 release.
>>
>> On Sat, Sep 15, 2018 at 11:46 AM Erik Erlandson <eerla...@redhat.com>
>> wrote:
>>
>>> In case this didn't make it onto this thread:
>>>
>>> There is a third option, which is to deprecate Py2 for Spark 3.0 and
>>> remove it entirely in a later 3.x release.
>>>
>>> On Sat, Sep 15, 2018 at 11:09 AM, Erik Erlandson <eerla...@redhat.com>
>>> wrote:
>>>
>>>> On a separate dev@spark thread, I raised the question of whether or
>>>> not to support Python 2 in Apache Spark going forward into Spark 3.0.
>>>>
>>>> Python 2 is going EOL <https://github.com/python/devguide/pull/344>
>>>> at the end of 2019. The upcoming release of Spark 3.0 is an
>>>> opportunity to make breaking changes to Spark's APIs, so it is a good
>>>> time to reconsider support for Python 2 in PySpark.
>>>>
>>>> Key advantages to dropping Python 2 are:
>>>>
>>>> - Supporting PySpark becomes significantly easier.
>>>> - We avoid having to support Python 2 until Spark 4.0, which would
>>>>   likely mean supporting Python 2 for some time after it goes EOL.
>>>>
>>>> (Note that supporting Python 2 after EOL means, among other things,
>>>> that PySpark would be supporting a version of Python that is no
>>>> longer receiving security patches.)
>>>>
>>>> The main disadvantage is that PySpark users with legacy Python 2 code
>>>> would have to migrate it to Python 3 to take advantage of Spark 3.0.
>>>>
>>>> This decision obviously has large implications for the Apache Spark
>>>> community, and we want to solicit community feedback.