Interesting findings. Looking at Dataflow Python usage in internal
telemetry, I see that Python 3.11 has slightly more usage than Python 3.8.
When I exclude Dev SDKs (this might also exclude some Google-internal users
who use bleeding-edge SDKs), Python 3.8 rises to the top. If I exclude
Google Dynamic "FLEX" templates, the following become the top 3:

Apache Beam Python 3.9 SDK: 24.40%
Apache Beam Python 3.7 SDK: 23.34%
Apache Beam Python 3.8 SDK: 21.63%

Python 3.8 dropping to third once Flex templates are excluded might be
explained by the fact that the default "Python3" flex template image
referenced in the docs (at
https://cloud.google.com/dataflow/docs/guides/templates/using-flex-templates#python_3)
is Python 3.8.

> On the other hand, I do like the idea of letting the Python EoL cycle
> drive our own supported versions.

+1. As much as I don't like forced upgrades, it won't be sustainable long
term to keep supporting versions indefinitely. I don't anticipate any
blockers for switching from Python 3.8 to Python 3.9.

> For many workflows like our unit test suites this is not a large change;
> the Python version matrix simply omits 3.8 and runs on the remaining python
> versions as expected. This is more complicated for a number of workflows
> that currently only run on 3.8 or both 3.8 and 3.12, as GitHub will not run
> the updated actions in the main repository until the PR updating them is
> submitted.

Yes, that's a known inconvenience. I believe this can be worked around by
pushing the changes to a branch on the main repo and then manually
triggering a GHA workflow from that branch, if you want to be really
careful. I think we have this documented somewhere, but I couldn't quickly
find it. @Danny McCormick <dannymccorm...@google.com> might have a link.
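For reference, a minimal sketch of that manual trigger. It assumes the
GitHub CLI (`gh`) is installed and authenticated, and that the workflow in
question declares a `workflow_dispatch` trigger. The workflow file name and
branch name below are hypothetical placeholders, not real Beam names.

```python
import shlex
from typing import List


def workflow_dispatch_command(workflow_file: str, branch: str) -> List[str]:
    """Build a `gh workflow run` command that runs a workflow from a branch.

    `--ref` selects the branch whose copy of the workflow file is used.
    """
    return ["gh", "workflow", "run", workflow_file, "--ref", branch]


# Hypothetical names, for illustration only.
cmd = workflow_dispatch_command("example_workflow.yml", "drop-py38-validation")
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` from a checkout of the
repo would actually dispatch the run; the sketch only prints the command so
it can be reviewed first.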

Merging and iterating sounds good to me, as long as we can quickly roll
back or fix forward so that PRs are not blocked by failing tests.

We also set the default Python version in
https://github.com/apache/beam/blob/9c0a9503ebd59778d488dcfff7fb9417a808152b/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L2960
and changing it might affect some workflows.

> To Robert Bradshaw's point, I wouldn't necessarily be opposed to pushing
> out this process to 2.61.0.

As long as we don't add a new version before removing an existing one,
there is probably no significant difference for us.

Our dependencies (like numpy, pandas, etc) are definitely dropping Python
3.8 support, usually ahead of us. Some Google Cloud Python Client libraries
are planning to drop Python 3.8 support after EOL as well.

On Mon, Aug 26, 2024 at 11:17 AM Jack McCluskey via dev <dev@beam.apache.org>
wrote:

> To Robert Bradshaw's point, I wouldn't necessarily be opposed to pushing
> out this process to 2.61.0. That does give more time to validate some of
> the actions changes and let us warn users about the drop in 3.8 support in
> a release. Admittedly a major motivator for moving off of 3.8 at EoL is so
> I can do some overhauling of the type hinting code, as 3.8 is the last
> version where PEP-585 type hints are not supported by default (some context
> for this is available on my Current State of Beam Python Type Hinting doc
> <https://s.apache.org/beam-python-type-hinting-overview> from last
> November.) But that isn't necessarily urgent work as far as users are
> concerned.
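To illustrate the PEP 585 point above: a minimal sketch (not Beam's actual
type hinting code) of why 3.8 is the odd one out. On 3.8, subscripting a
builtin such as `list[int]` raises a TypeError at runtime, so code has to
use the `typing` aliases; from 3.9 on, the builtin generics work directly.
The function names here are made up for illustration.

```python
import sys
from typing import Dict, List


def builtin_generics_supported() -> bool:
    """Return True if builtin collections can be subscripted (PEP 585)."""
    try:
        list[int]  # TypeError on Python 3.8, fine on 3.9+
    except TypeError:
        return False
    return True


def word_lengths(words: List[str]) -> Dict[str, int]:
    """Legacy-style hints from `typing`; these work on 3.8 and later."""
    return {word: len(word) for word in words}


if builtin_generics_supported():
    # Annotations are evaluated when the function is defined, so this
    # definition is only safe to execute on 3.9+.
    def word_lengths_pep585(words: list[str]) -> dict[str, int]:
        return {word: len(word) for word in words}
```

Once 3.8 is out of the support matrix, the guard and the `typing` imports
in a module like this become unnecessary.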
>
> There's an argument for trying to keep our documentation and tutorials
> pointing at relatively recent versions of Beam, but that's probably best
> left as a best-effort type thing for now.
>
> On Mon, Aug 26, 2024 at 1:41 PM Robert Burke <lostl...@apache.org> wrote:
>
>> A minor point, but often when onboarding, folks will try things verbatim
>> from the website and documentation:
>>
>>
>> https://github.com/search?q=repo%3Aapache%2Fbeam+python3.7+lang%3AMarkdown+&type=code
>>
>> Granted, the most popular combo there was not present in this search, so
>> it's probably not terribly significant, compared to the reason Robert is
>> guessing.
>>
>> Dunno what we can do about that without going all out in specifying
>> templated versions to use in our various docs. (That has the different
>> problem of ensuring everything being described actually works as typed out,
>> and we are not set up to efficiently validate that for every release.)
>>
>> On 2024/08/26 17:30:23 Robert Bradshaw via dev wrote:
>> > So, 3.8 remains the most popular python version per pypi:
>> > https://pypistats.org/packages/apache-beam
>> >
>> > Breaking down by Beam version over the last 90 days we get
>> >
>> > https://docs.google.com/spreadsheets/d/1-PPxZHs17aXvXgdl439tF7IqIs0XUxtDbDxGYcBg92I
>> >
>> > Which shows that this remains true even for the latest Beam releases.
>> > (Interestingly, one of the most popular combinations is the Python 3.7 +
>> > Beam 2.48. I wonder if people are holding off upgrading Beam due to
>> > Python 3.7 being dropped...)
>> >
>> > Of course, the relationship between pypi downloads and actual customer
>> > usage is not 1:1, but is likely directional at least.
>> >
>> > On the other hand, I do like the idea of letting the Python EoL cycle
>> > drive our own supported versions. Given that 3.8 EoL is in October and
>> > our release is (hopefully) also in October, what if we instead planned
>> > on making 2.60 (tentatively) the last officially supported 3.8 release,
>> > rather than the release in which we drop 3.8, and then see what the
>> > stats say once Python 3.8 is officially EoL? Yes, we could just drop it
>> > if that's the consensus, but given these usage numbers I don't think
>> > the case is so clear cut.
>> >
>> > We could also look at what our dependencies are doing. And if supporting
>> > 3.8 becomes difficult (e.g. is it being removed from github actions?)
>> > that's another reason to do so.
>> >
>> >
>> > On Mon, Aug 26, 2024 at 9:42 AM Robert Burke <rob...@frantil.com> wrote:
>> >
>> > > I'd be careful about relying only on the most recent release (as much
>> > > as it supports the consensus point). The most recent beam version is
>> > > inherently going to have smaller and less consistent numbers, vs N-1
>> > > or N-2, since only the most keen or in need update immediately.
>> > >
>> > > On Mon, Aug 26, 2024, 9:27 AM Danny McCormick via dev
>> > > <dev@beam.apache.org> wrote:
>> > >
>> > >> Was about to respond, Rebo you beat me to it! I agree DockerHub is
>> > >> the right thing to look at since Pypi reporting isn't awesome. I think
>> > >> we should only look at the most recent versions though, since 3.8
>> > >> will work for old versions forever.
>> > >>
>> > >> For 2.58.0 last month (partial month results), I see:
>> > >>
>> > >> "Repo","Unique IPs","Pull by tag","Pull by digest","Version check"
>> > >> "beam_python312_sdk",151,70,0,410
>> > >> "beam_python311_sdk",151,64,0,360
>> > >> "beam_python310_sdk",40,97,0,13
>> > >> "beam_python3.9_sdk",18,388,0,14
>> > >> "beam_python3.8_sdk",36,97,0,2
>> > >>
>> > >> So it was <10% of pulls (including our automation as Rebo pointed out)
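A quick way to reproduce that kind of percentage from the CSV above is
sketched below. Which columns count as "pulls" is an assumption on my part:
the function takes the column names as a parameter so the share can be
checked under different readings. The helper name `share_by_repo` is
hypothetical.

```python
import csv
import io

# The 2.58.0 DockerHub export quoted above, copied verbatim.
DOCKERHUB_CSV = '''
"Repo","Unique IPs","Pull by tag","Pull by digest","Version check"
"beam_python312_sdk",151,70,0,410
"beam_python311_sdk",151,64,0,360
"beam_python310_sdk",40,97,0,13
"beam_python3.9_sdk",18,388,0,14
"beam_python3.8_sdk",36,97,0,2
'''.strip()


def share_by_repo(csv_text, columns):
    """Return each repo's fraction of the summed values in `columns`."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    totals = {
        row["Repo"]: sum(int(row[column]) for column in columns)
        for row in rows
    }
    grand_total = sum(totals.values())
    return {repo: count / grand_total for repo, count in totals.items()}


# Assumption: "pulls" means tag pulls + digest pulls + version checks.
all_activity = share_by_repo(
    DOCKERHUB_CSV, ["Pull by tag", "Pull by digest", "Version check"]
)
tag_only = share_by_repo(DOCKERHUB_CSV, ["Pull by tag"])
unique_ips = share_by_repo(DOCKERHUB_CSV, ["Unique IPs"])
```

With these numbers, the 3.8 image comes out at about 6.5% of all activity
and about 9.1% of unique IPs, but about 13.5% if only "Pull by tag" is
counted, so the exact figure depends on the reading.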
>> > >>
>> > >> I'll join Jack, Kenn, and Rebo and agree dropping support is the
>> > >> right thing here. The plan SGTM as well.
>> > >>
>> > >> Thanks,
>> > >> Danny
>> > >>
>> > >> On Mon, Aug 26, 2024 at 5:21 PM Robert Burke <rob...@frantil.com> wrote:
>> > >>
>> > >>> As an approximation we can use the docker container pulls at least.
>> > >>>
>> > >>>
>> > >>> Py version : Pulls last week
>> > >>>
>> > >>> 3.8:  7476
>> > >>> 3.9:  1259
>> > >>> 3.10: 6169
>> > >>> 3.11: 2999
>> > >>> 3.12: 241
>> > >>>
>> > >>> 3.7: 395
>> > >>> 3.6: 241
>> > >>> 3.4: 156
>> > >>> 2.7: 188
>> > >>>
>> > >>> But note that any of our automation for 3.8 that pulls containers
>> > >>> would impact these results too.
>> > >>>
>> > >>> I will note that Beam dropping 3.8 support shouldn't be a problem
>> > >>> given the general end of support of 3.8.
>> > >>>
>> > >>> Users can always upgrade their python version separately from the
>> > >>> Beam version, and then update the Beam version. Ultimately, the cost
>> > >>> of the latest and greatest version is staying up to date.
>> > >>>
>> > >>>
>> > >>> On Mon, Aug 26, 2024, 8:24 AM Kenneth Knowles <k...@apache.org> wrote:
>> > >>>
>> > >>>> SGTM
>> > >>>>
>> > >>>> Incidentally I poked around on pypi for a minute but didn't find
>> > >>>> even basic download analytics. Do we have data about usage of Python
>> > >>>> versions? (this is not pushback - I'm all for turning things down at
>> > >>>> a natural pace (or faster!); I'm just even *more* for having data
>> > >>>> around it)
>> > >>>>
>> > >>>> Kenn
>> > >>>>
>> > >>>> On Mon, Aug 26, 2024 at 10:59 AM Jack McCluskey via dev <
>> > >>>> dev@beam.apache.org> wrote:
>> > >>>>
>> > >>>>> Hey everyone,
>> > >>>>>
>> > >>>>> With Python 3.8 reaching end-of-life in October, I've started the
>> > >>>>> work of removing support in the Beam repository. The aim is to
>> > >>>>> target Beam release 2.60.0 for this, since the expected release cut
>> > >>>>> date is on October 2nd, 2024. The start of this effort is at
>> > >>>>> https://github.com/apache/beam/pull/32283/, updating our GitHub
>> > >>>>> Actions workflows. For many workflows like our unit test suites this
>> > >>>>> is not a large change; the Python version matrix simply omits 3.8
>> > >>>>> and runs on the remaining python versions as expected. This is more
>> > >>>>> complicated for a number of workflows that currently only run on
>> > >>>>> 3.8 or both 3.8 and 3.12, as GitHub will not run the updated actions
>> > >>>>> in the main repository until the PR updating them is submitted. This
>> > >>>>> can already be seen in some workflow runs on the PR where Python 3.8
>> > >>>>> is no longer being installed in the runner environment, leading to
>> > >>>>> failures.
>> > >>>>>
>> > >>>>> The current plan is to do as much validation of the new workflow
>> > >>>>> files as I can before the above PR is submitted (hopefully the week
>> > >>>>> after Beam Summit,) then focus on getting any potential workflow
>> > >>>>> breakages resolved before removing the core Python 3.8 support from
>> > >>>>> the package. There may be some instability with our workflows, and
>> > >>>>> I will try my best to resolve things as they pop up. This is the
>> > >>>>> first Python version to have support dropped since we migrated to
>> > >>>>> GitHub Actions, so there's going to be a decent amount of trial and
>> > >>>>> error as we navigate this. That said, if you notice problems please
>> > >>>>> let me know! Either file a standalone issue and tag me on it
>> > >>>>> (@jrmccluskey) or leave a comment on
>> > >>>>> https://github.com/apache/beam/issues/31192 so I can take a look.
>> > >>>>>
>> > >>>>> Thanks,
>> > >>>>>
>> > >>>>> Jack McCluskey
>> > >>>>>
>> > >>>>> --
>> > >>>>>
>> > >>>>>
>> > >>>>> Jack McCluskey
>> > >>>>> SWE - DataPLS PLAT/ Dataflow ML
>> > >>>>> RDU
>> > >>>>> jrmcclus...@google.com
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> >
>>
>
