Some good news on Python 3.x support: thanks to +David Song
<wintermel...@google.com> and +Yifan Zou <yifan...@google.com> we now have
Python 3.8 on Jenkins, and can start working on adding Python 3.8 support
to Beam (BEAM-8494).

> One interesting variable that has not been mentioned is what versions of
> Python 3 are available to users via their distribution channels (the Linux
> distributions they use to develop/run the pipelines).


Good point. Looking at Ubuntu 16.04, which comes with Python 3.5.2, we can
see that the end-of-life for 16.04 is in 2024 and end-of-support is April
2021 [1]. Both of these dates are beyond the announced Python 3.5 EOL in
September 2020 [2]. I think it would be difficult for Beam to keep Py3.5
support until these Ubuntu dates, and users of systems that ship old versions
of Python have viable workarounds:
- install a newer version of the Python interpreter via pyenv [3], from
sources, or from alternative repositories.
- use a Docker container that comes with a newer version of the interpreter.
- use older versions of Beam.

We didn't receive feedback from user@ on how long 3.x versions on the
lower/higher end of the range should stay supported. I would suggest for
now that we plan to support all Python 3.x versions that have been released
and have not reached EOL. We can discuss exceptions to this rule on a
case-by-case basis, weighing the maintenance burden of continuing support
against stopping early.
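
To make the proposed rule concrete, here is a minimal sketch of how the
supported set could be computed. It assumes the "EOL is roughly 5 years after
release" approximation mentioned earlier in this thread, and the release dates
below are my own recollection of the python.org dates, so please verify them
before relying on this. `supported_versions` is a hypothetical helper, not
something that exists in Beam.

```python
from datetime import date

# Assumed first-release dates; verify against python.org before relying on them.
RELEASE_DATES = {
    "3.5": date(2015, 9, 13),
    "3.6": date(2016, 12, 23),
    "3.7": date(2018, 6, 27),
    "3.8": date(2019, 10, 14),
}

# Approximation used in this thread: EOL is about 5 years after release.
SUPPORT_YEARS = 5


def eol_date(released):
    """Approximate EOL as the release date plus SUPPORT_YEARS."""
    return released.replace(year=released.year + SUPPORT_YEARS)


def version_key(version):
    """Sort "3.10" after "3.9" by comparing numeric components."""
    return tuple(int(part) for part in version.split("."))


def supported_versions(today):
    """Versions that are already released and not yet past approximate EOL."""
    return sorted(
        (
            version
            for version, released in RELEASE_DATES.items()
            if released <= today < eol_date(released)
        ),
        key=version_key,
    )


print(supported_versions(date.today()))
```

Exceptions decided case by case would then be small overrides on top of this
computed set, rather than a separate list to maintain by hand.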

We should now focus on adjusting our Python test infrastructure to make it
easy to split the 3.5, 3.6, 3.7, and 3.8 suites into high-priority and
low-priority suites according to the Python version. Ideally, we
should make it easy to change which versions are high/low priority without
having to change all the individual test suites, and without losing test
coverage signal.
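
One possible shape for this, purely as a sketch: keep a single table that maps
each Python version to a priority, and derive what runs on each trigger from
it. The suite names, trigger names, and the `suites_for` / `coverage` helpers
are all hypothetical and do not correspond to existing Beam or Jenkins
configuration; the high/low assignment just follows the "lowest and highest
version" idea from this thread.

```python
# Single source of truth: change priorities here and nowhere else.
VERSION_PRIORITY = {
    "3.5": "high",  # lowest supported version
    "3.6": "low",
    "3.7": "low",
    "3.8": "high",  # highest supported version
}

# Hypothetical suites to run per trigger, keyed by version priority.
SUITES = {
    "precommit": {"high": ["unit", "smoke"], "low": ["smoke"]},
    "postcommit": {"high": ["integration"], "low": []},
    "daily": {"high": [], "low": ["unit", "integration"]},
}


def suites_for(trigger):
    """Map each Python version to the suites it runs for the given trigger."""
    plan = {}
    for version, priority in VERSION_PRIORITY.items():
        suites = SUITES[trigger][priority]
        if suites:
            plan[version] = list(suites)
    return plan


def coverage(version):
    """All suites a version runs across every trigger combined."""
    priority = VERSION_PRIORITY[version]
    return {
        suite
        for per_priority in SUITES.values()
        for suite in per_priority[priority]
    }


for trigger in SUITES:
    print(trigger, suites_for(trigger))
```

With this layout, low-priority versions still run every suite, only less
often, so promoting or demoting a version is a one-line change in
`VERSION_PRIORITY` and does not drop coverage.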

[1] https://wiki.ubuntu.com/Releases
[2] https://devguide.python.org/#status-of-python-branches
[3] https://github.com/pyenv/pyenv/blob/master/README.md

On Fri, Feb 28, 2020 at 1:25 AM Ismaël Mejía <ieme...@gmail.com> wrote:

> One interesting variable that has not been mentioned is what versions of
> Python 3 are available to users via their distribution channels (the Linux
> distributions they use to develop/run the pipelines).
>
> - RHEL 8 users have python 3.6 available
> - RHEL 7 users have python 3.6 available
> - Debian 10/Ubuntu 18.04 users have python 3.7/3.6 available
> - Debian 9/Ubuntu 16.04 users have python 3.5 available
>
> We should consider this when we evaluate future support removals.
>
> Given that the distros that support Python 3.5 are ~4y old, and since
> Python 3.5 is also losing LTS support soon, it is probably OK to not
> support it in Beam anymore, as Robert suggests.
>
>
> On Thu, Feb 27, 2020 at 3:57 AM Valentyn Tymofieiev <valen...@google.com>
> wrote:
>
>> Thanks everyone for sharing your perspectives so far. It sounds like we
>> can mitigate the cost of test infrastructure by having:
>> - a selection of (fast) tests that we will want to run against all Python
>> versions we support.
>> - high priority Python versions, which we will test extensively.
>> - infrequent postcommit tests that exercise low-priority versions.
>> We will need test infrastructure improvements to have the flexibility of
>> designating versions as high-pri/low-pri and minimizing the effort
>> required to adopt a new version.
>>
>> There is still a question of how long we want to support old Py3.x
>> versions. As mentioned above, I think we should not support them beyond EOL
>> (5 years after a release). I wonder if that is still too long. The cost of
>> supporting a version may include:
>>  - Developing against older Python version
>>  - Release overhead (building & storing containers, wheels, doing release
>> validation)
>>  - Complexity / development cost to support the quirks of the minor
>> versions.
>>
>> We can decide to drop support after, say, 4 years, or after usage drops
>> below a threshold, or decide on a case-by-case basis. Thoughts? I also
>> asked for feedback on user@ [1].
>>
>> [1]
>> https://lists.apache.org/thread.html/r630a3b55aa8e75c68c8252ea6f824c3ab231ad56e18d916dfb84d9e8%40%3Cuser.beam.apache.org%3E
>>
>> On Wed, Feb 26, 2020 at 5:27 PM Robert Bradshaw <rober...@google.com>
>> wrote:
>>
>>> On Wed, Feb 26, 2020 at 5:21 PM Valentyn Tymofieiev <valen...@google.com>
>>> wrote:
>>> >
>>> > > +1 to consulting users.
>>> > I will message user@ as well and point to this thread.
>>> >
>>> > > I would propose getting in warnings about 3.5 EoL well ahead of time.
>>> > I think we should document on our website, and in the code (warnings),
>>> > that users should not expect SDKs to be supported in Beam beyond the
>>> > EOL. If we want to have flexibility to drop support earlier than EOL,
>>> > we need to be more careful with messaging because users might
>>> > otherwise expect that support will last until EOL, if we mention EOL
>>> > date.
>>>
>>> +1
>>>
>>> > I am hoping that we can establish a consensus for when we will be
>>> dropping support for a version, so that we don't have to discuss it on a
>>> case by case basis in the future.
>>> >
>>> > > I think it would make sense to add support for 3.8 right away (or
>>> > > at least get a good sense of what work needs to be done and what our
>>> > > dependency situation is like)
>>> > https://issues.apache.org/jira/browse/BEAM-8494 is a starting point.
>>> > I tried 3.8 a while ago and some dependencies were not able to
>>> > install; I checked again just now. The SDK is "installable" after
>>> > minor changes. Some tests don't pass. BEAM-8494 does not have an owner
>>> > atm, and if anyone is interested I'm happy to give further pointers
>>> > and help get started.
>>> >
>>> > > For the 3.x series, I think we will get the most signal out of the
>>> lowest and highest version, and can get by with smoke tests +
>>> > infrequent post-commits for the ones between.
>>> >
>>> > > I agree with having low-frequency tests for low-priority versions.
>>> Low-priority versions could be determined according to least usage.
>>> >
>>> > These are good ideas. Do you think we will want to have the ability to
>>> > run some (inexpensive) tests for all versions frequently (on
>>> > presubmits), or is this extra complexity that can be avoided? I am
>>> > thinking about type inference, for example. Afaik inference logic is
>>> > very sensitive to the version. Would it be acceptable to catch errors
>>> > there in infrequent postcommits, or would an early signal be
>>> > preferred?
>>>
>>> This is a good example--the type inference tests are sensitive to
>>> version (due to using internal details and relying on the
>>> still-evolving typing module) but also run in ~15 seconds. I think
>>> these should be in precommits. We just don't need to run every test
>>> for every version.
>>>
>>> > On Wed, Feb 26, 2020 at 5:17 PM Kyle Weaver <kcwea...@google.com>
>>> wrote:
>>> >>
>>> >> Oh, I didn't see Robert's earlier email:
>>> >>
>>> >> > Currently 3.5 downloads sit at 3.7%, or about
>>> >> > 20% of all Python 3 downloads.
>>> >>
>>> >> Where did these numbers come from?
>>> >>
>>> >> On Wed, Feb 26, 2020 at 5:15 PM Kyle Weaver <kcwea...@google.com>
>>> wrote:
>>> >>>
>>> >>> > I agree with having low-frequency tests for low-priority versions.
>>> >>> > Low-priority versions could be determined according to least usage.
>>> >>>
>>> >>> +1. While the difference may not be as great between, say, 3.6 and
>>> 3.7, I think that if we had to choose, it would be more useful to test the
>>> versions folks are actually using the most. 3.5 only has about a third of
>>> the Docker pulls of 3.6 or 3.7 [1]. Does anyone have other usage statistics
>>> we can consult?
>>> >>>
>>> >>> [1] https://hub.docker.com/search?q=apachebeam%2Fpython&type=image
>>> >>>
>>> >>> On Wed, Feb 26, 2020 at 5:00 PM Ruoyun Huang <ruo...@google.com>
>>> wrote:
>>> >>>>
>>> >>>> I feel 4+ versions take too long to run anything.
>>> >>>>
>>> >>>> I would vote for lowest + highest, 2 versions.
>>> >>>>
>>> >>>> On Wed, Feb 26, 2020 at 4:52 PM Udi Meiri <eh...@google.com> wrote:
>>> >>>>>
>>> >>>>> I agree with having low-frequency tests for low-priority versions.
>>> >>>>> Low-priority versions could be determined according to least usage.
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> On Wed, Feb 26, 2020 at 4:06 PM Robert Bradshaw <
>>> rober...@google.com> wrote:
>>> >>>>>>
>>> >>>>>> On Wed, Feb 26, 2020 at 3:29 PM Kenneth Knowles <k...@apache.org>
>>> wrote:
>>> >>>>>> >
>>> >>>>>> > Are these divergent enough that they all need to consume
>>> testing resources? For example can lower priority versions be daily runs or
>>> some such?
>>> >>>>>>
>>> >>>>>> For the 3.x series, I think we will get the most signal out of the
>>> >>>>>> lowest and highest version, and can get by with smoke tests +
>>> >>>>>> infrequent post-commits for the ones between.
>>> >>>>>>
>>> >>>>>> > Kenn
>>> >>>>>> >
>>> >>>>>> > On Wed, Feb 26, 2020 at 3:25 PM Robert Bradshaw <
>>> rober...@google.com> wrote:
>>> >>>>>> >>
>>> >>>>>> >> +1 to consulting users. Currently 3.5 downloads sit at 3.7%,
>>> or about
>>> >>>>>> >> 20% of all Python 3 downloads.
>>> >>>>>> >>
>>> >>>>>> >> I would propose getting in warnings about 3.5 EoL well ahead
>>> of time,
>>> >>>>>> >> at the very least as part of the 2.7 warning.
>>> >>>>>> >>
>>> >>>>>> >> Fortunately, supporting multiple 3.x versions is significantly
>>> easier
>>> >>>>>> >> than spanning 2.7 and 3.x. I would rather not impose an
>>> ordering on
>>> >>>>>> >> dropping 3.5 and adding 3.8 but consider their merits
>>> independently.
>>> >>>>>> >>
>>> >>>>>> >>
>>> >>>>>> >> On Wed, Feb 26, 2020 at 3:16 PM Kyle Weaver <
>>> kcwea...@google.com> wrote:
>>> >>>>>> >> >
>>> >>>>>> >> > 5 versions is too many IMO. We've had issues with Python
>>> precommit resource usage in the past, and adding another version would
>>> surely exacerbate those issues. And we have also already had to leave out
>>> certain features on 3.5 [1]. Therefore, I am in favor of dropping 3.5
>>> before adding 3.8. After dropping Python 2 and adding 3.8, that will leave
>>> us with the latest three minor versions (3.6, 3.7, 3.8), which I think is
>>> closer to the "sweet spot." Though I would be interested in hearing if
>>> there are any users who would prefer we continue supporting 3.5.
>>> >>>>>> >> >
>>> >>>>>> >> > [1]
>>> https://github.com/apache/beam/blob/8658b95545352e51f35959f38334f3c7df8b48eb/sdks/python/apache_beam/runners/portability/flink_runner.py#L55
>>> >>>>>> >> >
>>> >>>>>> >> > On Wed, Feb 26, 2020 at 3:00 PM Valentyn Tymofieiev <
>>> valen...@google.com> wrote:
>>> >>>>>> >> >>
>>> >>>>>> >> >> I would like to start a discussion about identifying a
>>> guideline for answering questions like:
>>> >>>>>> >> >>
>>> >>>>>> >> >> 1. When will Beam support a new Python version (say, Python
>>> 3.8)?
>>> >>>>>> >> >> 2. When will Beam drop support for an old Python version
>>> (say, Python 3.5)?
>>> >>>>>> >> >> 3. How many Python versions should we aim to support
>>> concurrently (investigate issues, have continuous integration tests)?
>>> >>>>>> >> >> 4. What comes first: adding support for a new version (3.8)
>>> or deprecating older one (3.5)? This may affect the max load our test
>>> infrastructure needs to sustain.
>>> >>>>>> >> >>
>>> >>>>>> >> >> We are already getting requests for supporting Python 3.8,
>>> >>>>>> >> >> and there were some good reasons [1] to drop support for
>>> >>>>>> >> >> Python 3.5 (at least, early versions of 3.5). Answering
>>> >>>>>> >> >> these questions would help set expectations in the Beam user
>>> >>>>>> >> >> community and the Beam dev community, and may help us
>>> >>>>>> >> >> establish resource requirements for test infrastructure and
>>> >>>>>> >> >> plan efforts.
>>> >>>>>> >> >>
>>> >>>>>> >> >> PEP-0602 [2] establishes a yearly release cycle for Python
>>> versions starting from 3.9. Each release is a long-term support release and
>>> is supported for 5 years: first 1.5 years allow for general bug fix
>>> support, remaining 3.5 years have security fix support.
>>> >>>>>> >> >>
>>> >>>>>> >> >> At every point, there may be up to 5 Python minor versions
>>> >>>>>> >> >> that have not yet reached EOL, see "Release overlap with 12
>>> >>>>>> >> >> month diagram" [3]. We can try to support all of them, but
>>> >>>>>> >> >> that may come at a cost of velocity: we will have more tests
>>> >>>>>> >> >> to maintain, and we will have to develop Beam against a
>>> >>>>>> >> >> lower version for a longer period. Supporting fewer versions
>>> >>>>>> >> >> will have implications for user experience. It also may be
>>> >>>>>> >> >> difficult to ensure support of the most recent version
>>> >>>>>> >> >> early, since our dependencies (e.g. picklers) may not
>>> >>>>>> >> >> support it yet.
>>> >>>>>> >> >>
>>> >>>>>> >> >> Currently we support 4 Python versions (2.7, 3.5, 3.6, 3.7).
>>> >>>>>> >> >>
>>> >>>>>> >> >> Is 4 versions a sweet spot? Too much? Too little? What do
>>> you think?
>>> >>>>>> >> >>
>>> >>>>>> >> >> [1]
>>> https://github.com/apache/beam/pull/10821#issuecomment-590167711
>>> >>>>>> >> >> [2] https://www.python.org/dev/peps/pep-0602/
>>> >>>>>> >> >> [3] https://www.python.org/dev/peps/pep-0602/#id17
>>>
>>
