I like that option as a concrete proposal. I think we should circulate it more widely (the users list, a Twitter poll, at least a new thread...), maybe phrasing it as "is there any reason you couldn't migrate to Python 3 (or stick with an older version of Beam) after 2.23 (due to be cut in a couple of weeks)?" If there is strong concern or pushback, we could consider holding on for one more release.

On Tue, Jun 16, 2020 at 8:54 AM David Cavazos <dcava...@google.com> wrote:

> +1
>
> On Mon, Jun 15, 2020 at 6:52 PM Udi Meiri <eh...@google.com> wrote:
>
>> +1
>>
>> On Mon, Jun 15, 2020 at 4:27 PM Ahmet Altay <al...@google.com> wrote:
>>
>>> As a concrete proposal, could we commit to removing python 2 support by 2.24? In other words, mark the next release 2.23 as the last python 2 compatible Beam version.
>>>
>>> On Mon, Jun 15, 2020 at 2:09 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>
>>>> Another input here:
>>>>
>>>> If you opened a Python PR in the last few days, you probably noticed that our test suites were broken by a transitive dependency of Beam that dropped python 2 support, but did not declare python_requires>=3 in its setup.py [1]. This temporarily broke a subset of Beam Py2 users (who did not explicitly pin the 'rsa' dependency), and still affects Beam development[2].
>>>>
>>>> This is the second time[3] Beam is affected with an issue of this kind, so support of Python 2 starts to slow down our development, and add toil for maintainers of packages we depend on (both directly and transitively).
>>>>
>>>> [1] https://github.com/sybrenstuvel/python-rsa/issues/152
>>>> [2] https://lists.apache.org/thread.html/r9993b40b0c1cb8682ce56013165d4b80fdde0ee469a73bcb9466ddfb%40%3Cdev.beam.apache.org%3E
>>>> [3] https://github.com/hamcrest/PyHamcrest/issues/131
>>>>
>>>> On Tue, Jun 9, 2020 at 4:06 PM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> Thank you for re-opening this Valentyn. I am in favor of EOLing py2 support sooner than later. The reality is that we will not be effectively supporting beam python 2 for a long time while the ecosystem already EOLed python 2. That said, a significant chunk (but no longer a majority) of our users are still using python 2. Upgrades are painful, it might be especially painful nowadays. It would be good to hear counter view points, user voices related to this.
>>>>>
>>>>> On Thu, Jun 4, 2020 at 4:53 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>
>>>>>> Back at the end of February we decided to revisit this conversation in 3 months. Do folks on this thread have any new input or perspective regarding us balancing "user pain/contributor pain/our ability to continuously test with python 2 in a shifting environment"?
>>>>>>
>>>>>> Some new information on my end is that we have been seeing steady adoption of Python 3 among Beam Python users in Dataflow, particularly strong adoption among streaming users, and Dataflow is sunsetting Python 2 support for all released Beam SDKs later this year [1]. We will have to remove Python 2 Beam test suites that use Dataflow when Dataflow runner disables Py2 support if this happens before Beam Py2 EOL (when we have to remove all Py2 suites), including performance tests that still use Dataflow on Python 3.
>>>>>>
>>>>>> I am curious how much motivation there is in the community at this moment to continue Py2 support in Beam, whether any previous Py3 migration blockers were resolved or any new blockers discovered among Beam users.
>>>>>>
>>>>>> [1] https://cloud.google.com/python/docs/python2-sunset/#dataflow
>>>>>>
>>>>>> On Fri, May 8, 2020 at 3:52 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>>
>>>>>>> That's good news! Thanks for sharing.
>>>>>>>
>>>>>>> Another datapoint, here are a few of Beam's dependencies that no longer release new py2 artifacts (I looked at REQUIRED_PACKAGES + aws, gcp, and interactive extras):
>>>>>>>
>>>>>>> hdfs
>>>>>>> numpy
>>>>>>> pyarrow
>>>>>>> ipython
>>>>>>>
>>>>>>> There are more if we include transitive dependencies and test-only packages. I also remember encountering one issue last month that was broken only on Py2, which we had to go back and fix.
>>>>>>>
>>>>>>> If others have noticed frictions related to ongoing Py2 support or have updates on previously mentioned Py3 migration blockers, feel free to post them.
>>>>>>>
>>>>>>> On Fri, May 8, 2020 at 9:19 AM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>
>>>>>>>> It hasn't been 3 months yet, but I wanted to call out a milestone that Python 3 downloads crossed the 50% threshold on pypi, if just briefly.
>>>>>>>>
>>>>>>>> On Thu, Feb 13, 2020 at 12:40 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>>>>>>>> >
>>>>>>>> > > I would suggest re-evaluating this within the next 3 months again. We need to balance between user pain/contributor pain/our ability to continuously test with python 2 in a shifting environment.
>>>>>>>> >
>>>>>>>> > Good idea for the in 3 months evaluation, at that point also distributions will probably be phasing out python2 by default which definitely help in this direction.
>>>>>>>> > Thanks for updating the roadmap Ahmet
>>>>>>>> >
>>>>>>>> > On Thu, Feb 13, 2020 at 2:49 AM Ahmet Altay <al...@google.com> wrote:
>>>>>>>> >>
>>>>>>>> >> On Wed, Feb 12, 2020 at 1:29 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>>>>>>>> >>>
>>>>>>>> >>> I am with Chad on this, we should probably extend it a bit more, even if it makes us struggle a bit at least we have some workarounds as Robert suggests, and as Chad said there are still many people playing the python 3 catchup game, so worth to support those users.
>>>>>>>> >>>
>>>>>>>> >>> But maybe it is worth to evaluate the current state later in the year.
>>>>>>>> >>
>>>>>>>> >> I would suggest re-evaluating this within the next 3 months again. We need to balance between user pain/contributor pain/our ability to continuously test with python 2 in a shifting environment.
>>>>>>>> >>
>>>>>>>> >>> In the meantime can someone please update our Roadmap in the website with this info and where we are with Python 3 support (it looks not up to date).
>>>>>>>> >>> https://beam.apache.org/roadmap/
>>>>>>>> >>
>>>>>>>> >> I made a minor change to update that page (https://github.com/apache/beam/pull/10848). A more comprehensive update to that page and linked (https://beam.apache.org/roadmap/python-sdk/#python-3-support) would still be welcome.
>>>>>>>> >>
>>>>>>>> >>> - Ismaël
>>>>>>>> >>>
>>>>>>>> >>> On Tue, Feb 4, 2020 at 10:49 PM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>> >>>>
>>>>>>>> >>>> On Tue, Feb 4, 2020 at 12:12 PM Chad Dombrova <chad...@gmail.com> wrote:
>>>>>>>> >>>> >>
>>>>>>>> >>>> >> Not to mention that all the nice work for the type hints will have to be redone in the for 3.x.
>>>>>>>> >>>> >
>>>>>>>> >>>> > Note that there's a tool for automatically converting type comments to annotations: https://github.com/ilevkivskyi/com2ann
>>>>>>>> >>>> >
>>>>>>>> >>>> > So don't let that part bother you.
>>>>>>>> >>>>
>>>>>>>> >>>> +1, I wouldn't worry about what can be easily automated.
>>>>>>>> >>>>
>>>>>>>> >>>> > I'm curious what other features you'd like to be using in the Beam source that you cannot now.
>>>>>>>> >>>>
>>>>>>>> >>>> I hit things occasionally, e.g. I just ran into wanting keyword-only arguments the other day.
>>>>>>>> >>>>
>>>>>>>> >>>> >> It seems the faster we drop support the better.
>>>>>>>> >>>> >
>>>>>>>> >>>> > I've already gone over my position on this, but a refresher for those who care: some of the key vendors that support my industry will not offer python3-compatible versions of their software until the 4th quarter of 2020. If Beam switches to python3-only before that point we may be forced to stop contributing features (note: I'm the guy who added the type hints :). Every month you can give us would be greatly appreciated.
>>>>>>>> >>>>
>>>>>>>> >>>> As another data point, we're still 80/20 on Py2/Py3 for downloads at PyPi [1] (which I've heard should be taken with a grain of salt, but likely isn't totally off). IMHO that ratio needs to be way higher for Python 3 to consider dropping Python 2. It's pretty noisy, but say it doubles every 3 months that would put us at least mid-year before we hit a cross-over point. On the other hand Q4 2020 is probably a stretch.
>>>>>>>> >>>>
>>>>>>>> >>>> We could consider whether it needs to be an all-or-nothing thing as well. E.g. perhaps some features could be Python 3 only sooner than the whole codebase. (This would have to be well justified.) Another mitigation is that it is possible to mix Python 2 and Python 3 in the same pipeline with portability, so if there's a library that you need for one DoFn it doesn't mean you have to hold back your whole pipeline.
>>>>>>>> >>>>
>>>>>>>> >>>> - Robert
>>>>>>>> >>>>
>>>>>>>> >>>> [1] https://pypistats.org/packages/apache-beam , and that 20% may just be a spike.