I updated the Spark website and announced the plan for dropping Python 2 support there: http://spark.apache.org/news/plan-for-dropping-python-2-support.html. I will send an announcement email to user@ and dev@. -Xiangrui
On Fri, May 31, 2019 at 10:54 PM Felix Cheung <felixcheun...@hotmail.com> wrote: > Very subtle, but someone might take > > “We will drop Python 2 support in a future release in 2020” > > to mean any / first release in 2020, whereas the next statement indicates > patch releases are not included in the above. It might help to reorder the items or > clarify the wording. > > > ------------------------------ > *From:* shane knapp <skn...@berkeley.edu> > *Sent:* Friday, May 31, 2019 7:38:10 PM > *To:* Denny Lee > *Cc:* Holden Karau; Bryan Cutler; Erik Erlandson; Felix Cheung; Mark > Hamstra; Matei Zaharia; Reynold Xin; Sean Owen; Wenchen Fen; Xiangrui Meng; > dev; user > *Subject:* Re: Should python-2 be supported in Spark 3.0? > > +1000 ;) > > On Sat, Jun 1, 2019 at 6:53 AM Denny Lee <denny.g....@gmail.com> wrote: > >> +1 >> >> On Fri, May 31, 2019 at 17:58 Holden Karau <hol...@pigscanfly.ca> wrote: >> >>> +1 >>> >>> On Fri, May 31, 2019 at 5:41 PM Bryan Cutler <cutl...@gmail.com> wrote: >>> >>>> +1, and the draft sounds good >>>> >>>> On Thu, May 30, 2019, 11:32 AM Xiangrui Meng <men...@gmail.com> wrote: >>>> >>>>> Here is the draft announcement: >>>>> >>>>> === >>>>> Plan for dropping Python 2 support >>>>> >>>>> As many of you already know, the Python core development team and many >>>>> widely used Python packages such as Pandas and NumPy will drop Python 2 support >>>>> on or before 2020/01/01. Apache Spark has supported both Python 2 and 3 >>>>> since the Spark 1.4 release in 2015. However, maintaining Python 2/3 >>>>> compatibility is an increasing burden, and it essentially limits the use of >>>>> Python 3 features in Spark. Given that the end of life (EOL) of Python 2 is >>>>> approaching, we plan to eventually drop Python 2 support as well. The current >>>>> plan is as follows: >>>>> >>>>> * In the next major release in 2019, we will deprecate Python 2 >>>>> support. PySpark users will see a deprecation warning if Python 2 is used. 
>>>>> We will publish a migration guide for PySpark users to migrate to Python >>>>> 3. >>>>> * We will drop Python 2 support in a future release in 2020, after >>>>> the Python 2 EOL on 2020/01/01. PySpark users will see an error if Python 2 is >>>>> used. >>>>> * For releases that support Python 2, e.g., Spark 2.4, patch >>>>> releases will continue to support Python 2. However, after the Python 2 EOL, >>>>> we >>>>> might not take patches that are specific to Python 2. >>>>> === >>>>> >>>>> Sean helped make a pass. If it looks good, I'm going to upload it to >>>>> the Spark website and announce it here. Let me know if you think we should do >>>>> a >>>>> VOTE instead. >>>>> >>>>> On Thu, May 30, 2019 at 9:21 AM Xiangrui Meng <men...@gmail.com> >>>>> wrote: >>>>> >>>>>> I created https://issues.apache.org/jira/browse/SPARK-27884 to track >>>>>> the work. >>>>>> >>>>>> On Thu, May 30, 2019 at 2:18 AM Felix Cheung < >>>>>> felixcheun...@hotmail.com> wrote: >>>>>> >>>>>>> We don’t usually reference a future release on the website >>>>>>> >>>>>>> > Spark website and state that Python 2 is deprecated in Spark 3.0 >>>>>>> >>>>>>> I suspect people will then ask when Spark 3.0 is coming out. >>>>>>> Might need to provide some clarity on that. >>>>>>> >>>>>> >>>>>> We can say the "next major release in 2019" instead of Spark 3.0. >>>>>> The Spark 3.0 timeline certainly requires a new thread to discuss. >>>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>> ------------------------------ >>>>>>> *From:* Reynold Xin <r...@databricks.com> >>>>>>> *Sent:* Thursday, May 30, 2019 12:59:14 AM >>>>>>> *To:* shane knapp >>>>>>> *Cc:* Erik Erlandson; Mark Hamstra; Matei Zaharia; Sean Owen; >>>>>>> Wenchen Fen; Xiangrui Meng; dev; user >>>>>>> *Subject:* Re: Should python-2 be supported in Spark 3.0? >>>>>>> >>>>>>> +1 on Xiangrui’s plan. 
>>>>>>> > >>>>>>> On Thu, May 30, 2019 at 7:55 AM shane knapp <skn...@berkeley.edu> >>>>>>> wrote: >>>>>>> >>>>>>>> I don't have a good sense of the overhead of continuing to support >>>>>>>>> Python 2; is it large enough to consider dropping it in Spark 3.0? >>>>>>>>> >>>>>>>>> from the build/test side, it will actually be pretty easy to >>>>>>>> continue supporting python2.7 for spark 2.x, as the feature sets won't >>>>>>>> be >>>>>>>> expanding. >>>>>>>> >>>>>>> >>>>>>>> that being said, i will be cracking a bottle of champagne when i >>>>>>>> can delete all of the ansible and anaconda configs for python2.x. :) >>>>>>>> >>>>>>> >>>>>> On the development side, in a future release that drops Python 2 >>>>>> support we can remove the code that maintains Python 2/3 compatibility and >>>>>> start using Python 3-only features, which is also quite exciting. >>>>>> >>>>>> >>>>>>> >>>>>>>> shane >>>>>>>> -- >>>>>>>> Shane Knapp >>>>>>>> UC Berkeley EECS Research / RISELab Staff Technical Lead >>>>>>>> https://rise.cs.berkeley.edu >>>>>>>> >>>>>>> >>> >>> -- >>> Twitter: https://twitter.com/holdenkarau >>> Books (Learning Spark, High Performance Spark, etc.): >>> https://amzn.to/2MaRAG9 >>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau >>> >> > > -- > Shane Knapp > UC Berkeley EECS Research / RISELab Staff Technical Lead > https://rise.cs.berkeley.edu >
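The deprecation warning described in the plan above (shown to PySpark users running under Python 2) could be implemented as a simple interpreter-version check at import time. The sketch below is hypothetical, not the actual PySpark code; the function name `warn_if_python2` and the warning text are assumptions for illustration:

```python
import sys
import warnings


def warn_if_python2():
    """Emit a DeprecationWarning when running under Python 2.

    Hypothetical sketch of the version check described in the plan;
    the real PySpark implementation may differ.
    """
    if sys.version_info[0] < 3:
        warnings.warn(
            "Python 2 support is deprecated and will be removed in a "
            "future release after the Python 2 EOL on 2020/01/01. "
            "Please migrate to Python 3.",
            DeprecationWarning,
        )


# Run the check once at import time, so users see the warning as
# soon as the package is loaded under Python 2.
warn_if_python2()
```

The later step in the plan (raising an error instead of a warning after Python 2 support is dropped) would replace the `warnings.warn` call with a `RuntimeError`.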
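Xiangrui's point about adopting Python 3-only features once the 2/3 compatibility code is removed can be illustrated with a small example. This is not Spark code, just a sketch of features (f-strings, keyword-only arguments, function annotations) that a 2/3-compatible codebase cannot use:

```python
from typing import List


def summarize(values: List[float], *, precision: int = 2) -> str:
    """Summarize a list of numbers as a formatted string.

    Keyword-only arguments (the bare ``*``) and function annotations
    are Python 3-only syntax.
    """
    mean = sum(values) / len(values)
    # f-strings (Python 3.6+) replace the %-formatting or .format()
    # calls needed for Python 2/3 compatibility.
    return f"mean={mean:.{precision}f} over {len(values)} values"
```

For instance, `summarize([1.0, 2.0, 3.0])` returns `"mean=2.00 over 3 values"`, and `precision` can only be passed by keyword.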