Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

Valentyn Tymofieiev Wed, 26 Feb 2020 17:22:28 -0800

> +1 to consulting users.
I will message user@ as well and point to this thread.


> I would propose getting in warnings about 3.5 EoL well ahead of time.
I think we should document on our website, and in the code (warnings), that
users should not expect Beam to support a Python version beyond its EOL. If we
want to have flexibility to drop support earlier than EOL, we need to be
more careful with messaging, because if we mention the EOL date, users might
otherwise expect that support will last until then.
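
To make the in-code warning idea concrete, here is a minimal sketch. The
function names and the shape of the EOL table are hypothetical, not existing
Beam APIs, and any dates passed in are placeholders that would have to come
from the official Python release schedule. The wording deliberately says "on
or before" so that it does not promise support until the EOL date.

```python
import sys
import warnings
from datetime import date


def eol_warning_message(version, eol_dates, today):
    """Return a warning string for `version`, or None if none applies.

    version:   (major, minor) tuple, e.g. (3, 5).
    eol_dates: dict mapping (major, minor) to a datetime.date (hypothetical table).
    today:     the date to compare against.
    """
    eol = eol_dates.get(version)
    if eol is None:
        # No end-of-support date is recorded for this version.
        return None
    name = "Python %d.%d" % version
    if today >= eol:
        return "%s reached EOL on %s and is no longer supported." % (
            name, eol.isoformat())
    return "Support for %s may be dropped on or before %s." % (
        name, eol.isoformat())


def warn_if_near_eol(eol_dates, today=None):
    """Emit a warning for the running interpreter if the table lists it."""
    message = eol_warning_message(
        sys.version_info[:2], eol_dates, today or date.today())
    if message:
        # Default category (UserWarning) so the message is visible by default.
        warnings.warn(message)
    return message
```

Such a check could be called once at SDK import time; where exactly it would
live in Beam is left open here.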

I am hoping that we can establish a consensus on when we will drop
support for a version, so that we don't have to discuss it on a
case-by-case basis in the future.

> I think it would make sense to add support for 3.8 right away (or at
least get a good sense of what work needs to be done and what our
dependency situation is like)
https://issues.apache.org/jira/browse/BEAM-8494 is a starting point. When I
tried 3.8 a while ago, some dependencies failed to install. I checked
again just now: the SDK is "installable" after minor changes, but some tests
don't pass. BEAM-8494 does not have an owner atm, and if anyone is interested
I'm happy to give further pointers and help get started.

> For the 3.x series, I think we will get the most signal out of the lowest
and highest version, and can get by with smoke tests +
infrequent post-commits for the ones between.

> I agree with having low-frequency tests for low-priority versions.
Low-priority versions could be determined according to least usage.

These are good ideas. Do you think we will want the ability to run
some (inexpensive) tests for all versions frequently (on presubmits), or
is this extra complexity that can be avoided? I am thinking about type
inference, for example. Afaik inference logic is very sensitive to the
Python version. Would it be acceptable to catch errors there in infrequent
postcommits, or would an early signal be preferred?
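
As a rough illustration of the "lowest + highest" proposal quoted above, the
sketch below assigns a test tier to each supported version. The function name
and the tier labels "full" and "smoke" are hypothetical and do not correspond
to existing Beam test-infrastructure settings.

```python
def assign_test_tiers(versions):
    """Map each (major, minor) version to a test tier.

    The lowest and highest versions get the "full" suite; versions in
    between get "smoke" tests only.
    """
    ordered = sorted(set(versions))
    if not ordered:
        return {}
    ends = {ordered[0], ordered[-1]}
    return {v: ("full" if v in ends else "smoke") for v in ordered}


tiers = assign_test_tiers([(3, 5), (3, 6), (3, 7), (3, 8)])
# (3, 5) and (3, 8) map to "full"; (3, 6) and (3, 7) map to "smoke".
```

Version-sensitive suites such as type inference could be added to the smoke
tier for every version if an early signal turns out to be worth the cost.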

On Wed, Feb 26, 2020 at 5:17 PM Kyle Weaver <[email protected]> wrote:

> Oh, I didn't see Robert's earlier email:
>
> > Currently 3.5 downloads sit at 3.7%, or about
> > 20% of all Python 3 downloads.
>
> Where did these numbers come from?
>
> On Wed, Feb 26, 2020 at 5:15 PM Kyle Weaver <[email protected]> wrote:
>
>> > I agree with having low-frequency tests for low-priority versions.
>> > Low-priority versions could be determined according to least usage.
>>
>> +1. While the difference may not be as great between, say, 3.6 and 3.7, I
>> think that if we had to choose, it would be more useful to test the
>> versions folks are actually using the most. 3.5 only has about a third
>> of the Docker pulls of 3.6 or 3.7 [1]. Does anyone have other usage
>> statistics we can consult?
>>
>> [1] https://hub.docker.com/search?q=apachebeam%2Fpython&type=image
>>
>> On Wed, Feb 26, 2020 at 5:00 PM Ruoyun Huang <[email protected]> wrote:
>>
>>> I feel 4+ versions take too long to run anything.
>>>
>>> would vote for lowest + highest,  2 versions.
>>>
>>> On Wed, Feb 26, 2020 at 4:52 PM Udi Meiri <[email protected]> wrote:
>>>
>>>> I agree with having low-frequency tests for low-priority versions.
>>>> Low-priority versions could be determined according to least usage.
>>>>
>>>>
>>>>
>>>> On Wed, Feb 26, 2020 at 4:06 PM Robert Bradshaw <[email protected]>
>>>> wrote:
>>>>
>>>>> On Wed, Feb 26, 2020 at 3:29 PM Kenneth Knowles <[email protected]>
>>>>> wrote:
>>>>> >
>>>>> > Are these divergent enough that they all need to consume testing
>>>>> resources? For example can lower priority versions be daily runs or some
>>>>> such?
>>>>>
>>>>> For the 3.x series, I think we will get the most signal out of the
>>>>> lowest and highest version, and can get by with smoke tests +
>>>>> infrequent post-commits for the ones between.
>>>>>
>>>>> > Kenn
>>>>> >
>>>>> > On Wed, Feb 26, 2020 at 3:25 PM Robert Bradshaw <[email protected]>
>>>>> wrote:
>>>>> >>
>>>>> >> +1 to consulting users. Currently 3.5 downloads sit at 3.7%, or
>>>>> about
>>>>> >> 20% of all Python 3 downloads.
>>>>> >>
>>>>> >> I would propose getting in warnings about 3.5 EoL well ahead of
>>>>> time,
>>>>> >> at the very least as part of the 2.7 warning.
>>>>> >>
>>>>> >> Fortunately, supporting multiple 3.x versions is significantly
>>>>> easier
>>>>> >> than spanning 2.7 and 3.x. I would rather not impose an ordering on
>>>>> >> dropping 3.5 and adding 3.8 but consider their merits independently.
>>>>> >>
>>>>> >>
>>>>> >> On Wed, Feb 26, 2020 at 3:16 PM Kyle Weaver <[email protected]>
>>>>> wrote:
>>>>> >> >
>>>>> >> > 5 versions is too many IMO. We've had issues with Python
>>>>> precommit resource usage in the past, and adding another version would
>>>>> surely exacerbate those issues. And we have also already had to leave out
>>>>> certain features on 3.5 [1]. Therefore, I am in favor of dropping 3.5
>>>>> before adding 3.8. After dropping Python 2 and adding 3.8, that will leave
>>>>> us with the latest three minor versions (3.6, 3.7, 3.8), which I think is
>>>>> closer to the "sweet spot." Though I would be interested in hearing if
>>>>> there are any users who would prefer we continue supporting 3.5.
>>>>> >> >
>>>>> >> > [1]
>>>>> https://github.com/apache/beam/blob/8658b95545352e51f35959f38334f3c7df8b48eb/sdks/python/apache_beam/runners/portability/flink_runner.py#L55
>>>>> >> >
>>>>> >> > On Wed, Feb 26, 2020 at 3:00 PM Valentyn Tymofieiev <
>>>>> [email protected]> wrote:
>>>>> >> >>
>>>>> >> >> I would like to start a discussion about identifying a guideline
>>>>> for answering questions like:
>>>>> >> >>
>>>>> >> >> 1. When will Beam support a new Python version (say, Python 3.8)?
>>>>> >> >> 2. When will Beam drop support for an old Python version (say,
>>>>> Python 3.5)?
>>>>> >> >> 3. How many Python versions should we aim to support
>>>>> concurrently (investigate issues, have continuous integration tests)?
>>>>> >> >> 4. What comes first: adding support for a new version (3.8) or
>>>>> deprecating older one (3.5)? This may affect the max load our test
>>>>> infrastructure needs to sustain.
>>>>> >> >>
>>>>> >> >> We are already getting requests for supporting Python 3.8 and
>>>>> there were some good reasons[1] to drop support for Python 3.5 (at least,
>>>>> early versions of 3.5). Answering these questions would help set
>>>>> expectations in the Beam user and dev communities, and may help us
>>>>> establish resource requirements for test infrastructure and plan efforts.
>>>>> >> >>
>>>>> >> >> PEP-0602 [2] establishes a yearly release cycle for Python
>>>>> versions starting from 3.9. Each release is a long-term support release
>>>>> and is supported for 5 years: first 1.5 years allow for general bug fix
>>>>> support, remaining 3.5 years have security fix support.
>>>>> >> >>
>>>>> >> >> At every point, there may be up to 5 Python minor versions that
>>>>> did not yet reach EOL, see "Release overlap with 12 month diagram" [3]. We
>>>>> can try to support all of them, but that may come at a cost of velocity:
>>>>> we will have more tests to maintain, and we will have to develop Beam against
>>>>> a lower version for a longer period. Supporting fewer versions will have
>>>>> implications for user experience. It also may be difficult to ensure
>>>>> support of the most recent version early, since our dependencies (e.g.
>>>>> picklers) may not be supporting it yet.
>>>>> >> >>
>>>>> >> >> Currently we support 4 Python versions (2.7, 3.5, 3.6, 3.7).
>>>>> >> >>
>>>>> >> >> Is 4 versions a sweet spot? Too much? Too little? What do you
>>>>> think?
>>>>> >> >>
>>>>> >> >> [1]
>>>>> https://github.com/apache/beam/pull/10821#issuecomment-590167711
>>>>> >> >> [2] https://www.python.org/dev/peps/pep-0602/
>>>>> >> >> [3] https://www.python.org/dev/peps/pep-0602/#id17
>>>>>
>>>>
