Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

Robert Bradshaw Wed, 26 Feb 2020 17:27:28 -0800

On Wed, Feb 26, 2020 at 5:21 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>
> > +1 to consulting users.
> I will message user@ as well and point to this thread.
>
> > I would propose getting in warnings about 3.5 EoL well ahead of time.
> I think we should document on our website, and in the code (warnings), that 
> users should not expect SDKs to be supported in Beam beyond the EOL. If we 
> want the flexibility to drop support earlier than EOL, we need to be more 
> careful with messaging: if we mention the EOL date, users might otherwise 
> expect that support will last until then.


+1
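
For concreteness, an in-code warning could look something like the
sketch below. This is only an illustration: the DEPRECATED_MINORS table
and the warn_if_deprecated helper are hypothetical names, not existing
Beam code, and the message deliberately avoids promising support up to
the upstream EOL date.

```python
import sys
import warnings

# Hypothetical policy table (illustrative values, not Beam's actual policy).
DEPRECATED_MINORS = {(3, 5)}


def warn_if_deprecated(version_info=None, deprecated=DEPRECATED_MINORS):
    """Warn if the given interpreter version is on a deprecated minor release.

    Returns True if a warning was emitted, False otherwise.
    """
    if version_info is None:
        version_info = sys.version_info
    minor = (version_info[0], version_info[1])
    if minor not in deprecated:
        return False
    # UserWarning is used because DeprecationWarning is hidden by default
    # in most contexts, and end users are the intended audience here.
    warnings.warn(
        "Support for Python %d.%d is deprecated and may be removed in a "
        "future release, possibly before that version's upstream end of "
        "life." % minor,
        UserWarning,
        stacklevel=2,
    )
    return True
```

Calling warn_if_deprecated() once at import time would be enough to
surface the message in user pipelines.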

> I am hoping that we can establish a consensus for when we will be dropping 
> support for a version, so that we don't have to discuss it on a case by case 
> basis in the future.
>
> > I think it would make sense to add support for 3.8 right away (or at least 
> > get a good sense of what work needs to be done and what our dependency 
> > situation is like)
> https://issues.apache.org/jira/browse/BEAM-8494 is a starting point. I tried 
> 3.8 a while ago and some dependencies could not be installed; I checked again 
> just now. The SDK is "installable" after minor changes, but some tests don't 
> pass. BEAM-8494 does not have an owner atm, and if anyone is interested I'm 
> happy to give further pointers and help get started.
>
> > For the 3.x series, I think we will get the most signal out of the lowest 
> > and highest version, and can get by with smoke tests +
> > infrequent post-commits for the ones between.
>
> > I agree with having low-frequency tests for low-priority versions. 
> > Low-priority versions could be determined according to least usage.
>
> These are good ideas. Do you think we will want the ability to run 
> some (inexpensive) tests for all versions frequently (on presubmits), or 
> is this extra complexity that can be avoided? I am thinking about type 
> inference, for example. Afaik the inference logic is very sensitive to the 
> version. Would it be acceptable to catch errors there in infrequent 
> postcommits, or would an early signal be preferred?

This is a good example--the type inference tests are sensitive to
version (due to using internal details and relying on the
still-evolving typing module) but also run in ~15 seconds. I think
these should be in precommits. We just don't need to run every test
for every version.
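
Roughly, the selection I have in mind could be sketched as below. The
tier names ("full", "smoke", "typehints") and the plan_test_suites
helper are made up for illustration and don't correspond to existing
Beam test targets.

```python
def plan_test_suites(versions, always_run=("typehints",)):
    """Map each (major, minor) version to the suites run on every precommit.

    The lowest and highest versions get the full suite (which is assumed
    to include the cheap version-sensitive tests already). Versions in
    between get smoke tests plus the cheap version-sensitive suites
    listed in `always_run`.
    """
    ordered = sorted(set(versions))
    if not ordered:
        return {}
    full_coverage = {ordered[0], ordered[-1]}
    plan = {}
    for version in ordered:
        if version in full_coverage:
            plan[version] = ["full"]
        else:
            plan[version] = ["smoke"] + list(always_run)
    return plan
```

With 3.5 through 3.8 as input, 3.5 and 3.8 would run everything, while
3.6 and 3.7 would run only smoke tests and the type inference tests on
precommit, leaving the rest to infrequent postcommits.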

> On Wed, Feb 26, 2020 at 5:17 PM Kyle Weaver <kcwea...@google.com> wrote:
>>
>> Oh, I didn't see Robert's earlier email:
>>
>> > Currently 3.5 downloads sit at 3.7%, or about
>> > 20% of all Python 3 downloads.
>>
>> Where did these numbers come from?
>>
>> On Wed, Feb 26, 2020 at 5:15 PM Kyle Weaver <kcwea...@google.com> wrote:
>>>
>>> > I agree with having low-frequency tests for low-priority versions.
>>> > Low-priority versions could be determined according to least usage.
>>>
>>> +1. While the difference may not be as great between, say, 3.6 and 3.7, I 
>>> think that if we had to choose, it would be more useful to test the 
>>> versions folks are actually using the most. 3.5 only has about a third of 
>>> the Docker pulls of 3.6 or 3.7 [1]. Does anyone have other usage statistics 
>>> we can consult?
>>>
>>> [1] https://hub.docker.com/search?q=apachebeam%2Fpython&type=image
>>>
>>> On Wed, Feb 26, 2020 at 5:00 PM Ruoyun Huang <ruo...@google.com> wrote:
>>>>
>>>> I feel that with 4+ versions, anything takes too long to run.
>>>>
>>>> I would vote for lowest + highest, 2 versions.
>>>>
>>>> On Wed, Feb 26, 2020 at 4:52 PM Udi Meiri <eh...@google.com> wrote:
>>>>>
>>>>> I agree with having low-frequency tests for low-priority versions.
>>>>> Low-priority versions could be determined according to least usage.
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 26, 2020 at 4:06 PM Robert Bradshaw <rober...@google.com> 
>>>>> wrote:
>>>>>>
>>>>>> On Wed, Feb 26, 2020 at 3:29 PM Kenneth Knowles <k...@apache.org> wrote:
>>>>>> >
>>>>>> > Are these divergent enough that they all need to consume testing 
>>>>>> > resources? For example can lower priority versions be daily runs or 
>>>>>> > some such?
>>>>>>
>>>>>> For the 3.x series, I think we will get the most signal out of the
>>>>>> lowest and highest version, and can get by with smoke tests +
>>>>>> infrequent post-commits for the ones between.
>>>>>>
>>>>>> > Kenn
>>>>>> >
>>>>>> > On Wed, Feb 26, 2020 at 3:25 PM Robert Bradshaw <rober...@google.com> 
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> +1 to consulting users. Currently 3.5 downloads sit at 3.7%, or about
>>>>>> >> 20% of all Python 3 downloads.
>>>>>> >>
>>>>>> >> I would propose getting in warnings about 3.5 EoL well ahead of time,
>>>>>> >> at the very least as part of the 2.7 warning.
>>>>>> >>
>>>>>> >> Fortunately, supporting multiple 3.x versions is significantly easier
>>>>>> >> than spanning 2.7 and 3.x. I would rather not impose an ordering on
>>>>>> >> dropping 3.5 and adding 3.8 but consider their merits independently.
>>>>>> >>
>>>>>> >>
>>>>>> >> On Wed, Feb 26, 2020 at 3:16 PM Kyle Weaver <kcwea...@google.com> 
>>>>>> >> wrote:
>>>>>> >> >
>>>>>> >> > 5 versions is too many IMO. We've had issues with Python precommit 
>>>>>> >> > resource usage in the past, and adding another version would surely 
>>>>>> >> > exacerbate those issues. And we have also already had to leave out 
>>>>>> >> > certain features on 3.5 [1]. Therefore, I am in favor of dropping 
>>>>>> >> > 3.5 before adding 3.8. After dropping Python 2 and adding 3.8, that 
>>>>>> >> > will leave us with the latest three minor versions (3.6, 3.7, 3.8), 
>>>>>> >> > which I think is closer to the "sweet spot." Though I would be 
>>>>>> >> > interested in hearing if there are any users who would prefer we 
>>>>>> >> > continue supporting 3.5.
>>>>>> >> >
>>>>>> >> > [1] 
>>>>>> >> > https://github.com/apache/beam/blob/8658b95545352e51f35959f38334f3c7df8b48eb/sdks/python/apache_beam/runners/portability/flink_runner.py#L55
>>>>>> >> >
>>>>>> >> > On Wed, Feb 26, 2020 at 3:00 PM Valentyn Tymofieiev 
>>>>>> >> > <valen...@google.com> wrote:
>>>>>> >> >>
>>>>>> >> >> I would like to start a discussion about identifying a guideline 
>>>>>> >> >> for answering questions like:
>>>>>> >> >>
>>>>>> >> >> 1. When will Beam support a new Python version (say, Python 3.8)?
>>>>>> >> >> 2. When will Beam drop support for an old Python version (say, 
>>>>>> >> >> Python 3.5)?
>>>>>> >> >> 3. How many Python versions should we aim to support concurrently 
>>>>>> >> >> (investigate issues, have continuous integration tests)?
>>>>>> >> >> 4. What comes first: adding support for a new version (3.8) or 
>>>>>> >> >> deprecating older one (3.5)? This may affect the max load our test 
>>>>>> >> >> infrastructure needs to sustain.
>>>>>> >> >>
>>>>>> >> >> We are already getting requests for supporting Python 3.8 and 
>>>>>> >> >> there were some good reasons[1] to drop support for Python 3.5 (at 
>>>>>> >> >> least, early versions of 3.5). Answering these questions would 
>>>>>> >> >> help set expectations in the Beam user and dev communities, and 
>>>>>> >> >> may help us establish resource requirements for test 
>>>>>> >> >> infrastructure and plan efforts.
>>>>>> >> >>
>>>>>> >> >> PEP-0602 [2] establishes a yearly release cycle for Python 
>>>>>> >> >> versions starting from 3.9. Each release is a long-term support 
>>>>>> >> >> release and is supported for 5 years: first 1.5 years allow for 
>>>>>> >> >> general bug fix support, remaining 3.5 years have security fix 
>>>>>> >> >> support.
>>>>>> >> >>
>>>>>> >> >> At every point, there may be up to 5 Python minor versions that 
>>>>>> >> >> did not yet reach EOL, see "Release overlap with 12 month diagram" 
>>>>>> >> >> [3]. We can try to support all of them, but that may come at a 
>>>>>> >> >> cost of velocity: we will have more tests to maintain, and we will 
>>>>>> >> >> have to develop Beam against a lower version for a longer period. 
>>>>>> >> >> Supporting fewer versions will have implications for user 
>>>>>> >> >> experience. It also may be difficult to ensure support of the most 
>>>>>> >> >> recent version early, since our dependencies (e.g. picklers) may 
>>>>>> >> >> not support it yet.
>>>>>> >> >>
>>>>>> >> >> Currently we support 4 Python versions (2.7, 3.5, 3.6, 3.7).
>>>>>> >> >>
>>>>>> >> >> Is 4 versions a sweet spot? Too many? Too few? What do you 
>>>>>> >> >> think?
>>>>>> >> >>
>>>>>> >> >> [1] 
>>>>>> >> >> https://github.com/apache/beam/pull/10821#issuecomment-590167711
>>>>>> >> >> [2] https://www.python.org/dev/peps/pep-0602/
>>>>>> >> >> [3] https://www.python.org/dev/peps/pep-0602/#id17
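
As a sanity check on the "up to 5 versions" figure quoted above, here is
a minimal sketch of the arithmetic, assuming exactly one release every
12 months and exactly 60 months of support per release (the numbers
given in the thread). The function and constant names are illustrative
only.

```python
RELEASE_INTERVAL_MONTHS = 12  # yearly cadence, as described for PEP 602
SUPPORT_MONTHS = 60           # 5 years of support per release


def versions_in_support(month,
                        interval=RELEASE_INTERVAL_MONTHS,
                        support=SUPPORT_MONTHS):
    """Count releases that are out and not yet EOL at `month`.

    Releases are assumed to happen at months 0, interval, 2 * interval, ...
    A release made at month r is in support while r <= month < r + support.
    """
    count = 0
    release = 0
    while release <= month:
        if month < release + support:
            count += 1
        release += interval
    return count


# Maximum overlap over a 20-year window under these assumptions.
max_overlap = max(versions_in_support(m) for m in range(240))
```

Under these assumptions the overlap ramps up over the first four years
and then stays at 5.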
