Thank you all. Python 2, 3.4, and 3.5 are now dropped in the master branch: https://github.com/apache/spark/pull/28957
On Fri, Jul 3, 2020 at 10:01 AM, Hyukjin Kwon <gurwls...@gmail.com> wrote:

> Thanks Dongjoon. That makes much more sense now!
>
> On Fri, Jul 3, 2020 at 12:11 AM, Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>
>> Thank you, Hyukjin.
>>
>> According to the Python community, Python 3.5 also reaches EOL on 2020-09-13
>> (only two months left).
>>
>> - https://www.python.org/downloads/
>>
>> So, targeting live Python versions for Apache Spark 3.1.0 (December 2020)
>> looks reasonable to me.
>>
>> For old Python versions, we still have Apache Spark 2.4 LTS, and
>> Apache Spark 3.0.x will also work.
>>
>> Bests,
>> Dongjoon.
>>
>>
>> On Wed, Jul 1, 2020 at 10:50 PM Yuanjian Li <xyliyuanj...@gmail.com>
>> wrote:
>>
>>> +1, especially Python 2
>>>
>>> On Thu, Jul 2, 2020 at 10:20 AM, Holden Karau <hol...@pigscanfly.ca> wrote:
>>>
>>>> I'm OK with us dropping Python 2, 3.4, and 3.5 from Spark 3.1 onwards. It
>>>> will be exciting to get to use more recent Python features. The most recent
>>>> Ubuntu LTS ships with 3.7, and while the previous LTS ships with 3.5, if
>>>> folks really can't upgrade there's conda.
>>>>
>>>> Is there anyone with a large Python 3.5 fleet who can't use conda?
>>>>
>>>> On Wed, Jul 1, 2020 at 7:15 PM Hyukjin Kwon <gurwls...@gmail.com>
>>>> wrote:
>>>>
>>>>> Yeah, sure. They will be dropped from Spark 3.1 onwards. I don't think we
>>>>> should make such changes in maintenance releases.
>>>>>
>>>>> On Thu, Jul 2, 2020 at 11:13 AM, Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>>
>>>>>> To be clear, the plan is to drop them from Spark 3.1 onwards, yes?
>>>>>>
>>>>>> On Wed, Jul 1, 2020 at 7:11 PM Hyukjin Kwon <gurwls...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I would like to discuss dropping the deprecated Python versions 2, 3.4,
>>>>>>> and 3.5 at https://github.com/apache/spark/pull/28957. I assume
>>>>>>> people support it in general,
>>>>>>> but I am writing this to make sure everybody is happy.
>>>>>>>
>>>>>>> Fokko made a very good investigation into it; see
>>>>>>> https://github.com/apache/spark/pull/28957#issuecomment-652022449.
>>>>>>> Judging from the statistics, I think we're pretty safe to drop them.
>>>>>>> Also note that dropping Python 2 was actually declared at
>>>>>>> https://python3statement.org/
>>>>>>>
>>>>>>> Roughly speaking, there are several main advantages to dropping them:
>>>>>>> 1. It removes a bunch of hacks (around 700 lines) we added in
>>>>>>> PySpark.
>>>>>>> 2. PyPy2 has a critical bug that causes a flaky test
>>>>>>> (https://issues.apache.org/jira/browse/SPARK-28358), given my testing
>>>>>>> and investigation.
>>>>>>> 3. Users can use Python type hints with Pandas UDFs without
>>>>>>> worrying about the Python version.
>>>>>>> 4. Users can leverage the latest cloudpickle,
>>>>>>> https://github.com/apache/spark/pull/28950. With Python 3.8+ it can
>>>>>>> also leverage the C pickle implementation.
>>>>>>> 5. ...
>>>>>>>
>>>>>>> So it benefits both users and developers. WDYT?
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>> https://amzn.to/2MaRAG9
>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
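[Editor's note: the "Python type hints with Pandas UDFs" advantage in point 3 refers to the style introduced in Spark 3.0, where the UDF's input and output types are declared with ordinary Python type hints, which requires Python 3.6+. The sketch below is a minimal, hypothetical illustration of that style; the `pandas_udf` decorator usage is shown only in a comment so the example runs with plain pandas, without a Spark session.]

```python
import pandas as pd

# In PySpark 3.0+ (Python 3.6+), a Pandas UDF can be declared with Python
# type hints instead of a separate function-type argument, roughly:
#
#     from pyspark.sql.functions import pandas_udf
#
#     @pandas_udf("long")
#     def plus_one(s: pd.Series) -> pd.Series:
#         return s + 1
#
# The undecorated function below demonstrates the same type-hinted,
# Series-to-Series style (hypothetical example name).
def plus_one(s: pd.Series) -> pd.Series:
    return s + 1

print(plus_one(pd.Series([1, 2, 3])).tolist())  # [2, 3, 4]
```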