Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

Kyle Weaver Mon, 11 May 2020 10:10:23 -0700

We've since moved our official Docker images here:
https://hub.docker.com/search?q=apache%2Fbeam_python&type=image


But Docker downloads are not as representative of actual usage as PyPI.

On Mon, May 11, 2020 at 1:05 PM Yoshiki Obata <yoshiki.ob...@gmail.com>
wrote:

> Hello again,
>
> The test infrastructure update is ongoing, so we should now determine
> which Python versions are high-priority.
>
> According to PyPI download stats[1], the download share of Python
> 3.5 is almost always greater than that of 3.6 or 3.7.
> This situation has not changed since Robert told us that Python 3.x
> accounts for nearly 40% of downloads[2].
>
> On the other hand, according to Docker Hub[3], the most-downloaded
> apachebeam/python3.x_sdk image is the one for Python 3.7, as Kyle
> pointed out[4].
>
> Considering these stats, I think high-priority versions are 3.5 and 3.7.
>
> Is this assumption appropriate?
> I would like to hear your thoughts about this.
>
> [1] https://pypistats.org/packages/apache-beam
> [2]
> https://lists.apache.org/thread.html/r208c0d11639e790453a17249e511dbfe00a09f91bef8fcd361b4b74a%40%3Cdev.beam.apache.org%3E
> [3] https://hub.docker.com/search?q=apachebeam%2Fpython&type=image
> [4]
> https://lists.apache.org/thread.html/r9ca9ad316dae3d60a3bf298eedbe4aeecab2b2664454cc352648abc9%40%3Cdev.beam.apache.org%3E
>
> On Wed, May 6, 2020 at 12:48, Yoshiki Obata <yoshiki.ob...@gmail.com> wrote:
> >
> > > Not sure how run_pylint.sh is related here - we should run linter on
> the entire codebase.
> > ah, I mistyped... I meant run_pytest.sh
> >
> > > I am familiar with beam_PostCommit_PythonXX suites. Is there something
> specific about these suites that you wanted to know?
> > Test suite runtime will depend on the number of  tests in the suite,
> > how many tests we run in parallel, how long they take to run. To
> > understand the load on test infrastructure we can monitor Beam test
> > health metrics [1]. In particular, if time in queue[2] is high, it is
> > a sign that there are not enough Jenkins slots available to start the
> > test suite earlier.
> > Sorry for the ambiguous question. I wanted to know how to see the load on
> > test infrastructure.
> > The Grafana links you shared serve my purpose. Thank you.
> >
> > On Wed, May 6, 2020 at 2:35, Valentyn Tymofieiev <valen...@google.com> wrote:
> > >
> > > On Mon, May 4, 2020 at 7:06 PM Yoshiki Obata <yoshiki.ob...@gmail.com>
> wrote:
> > >>
> > >> Thank you for comment, Valentyn.
> > >>
> > >> > 1) We can seed the smoke test suite with typehints tests, and add
> more tests later if there is a need. We can identify them by the file path
> or by special attributes in test files. Identifying them using filepath
> seems simpler and independent of test runner.
> > >>
> > >> Yes, making run_pylint.sh accept target test file paths as arguments
> > >> would be a good approach, if possible.
> > >
> > >
> > > Not sure how run_pylint.sh is related here - we should run linter on
> the entire codebase.
> > >
> > >>
> > >> > 3)  We should reduce the code duplication across
> beam/sdks/python/test-suites/$runner/py3*. I think we could move the suite
> definition into a common file like
> beam/sdks/python/test-suites/$runner/build.gradle perhaps, and populate
> individual suites (beam/sdks/python/test-suites/$runner/py38/build.gradle)
> including the common file and/or logic from PythonNature [1].
> > >>
> > >> Exactly. I'll check it out.
> > >>
> > >> > 4) We have some tests that we run only under specific Python 3
> versions, for example: FlinkValidatesRunner test runs using Python 3.5: [2]
> > >> > HDFS Python 3 tests are running only with Python 3.7 [3].
> Cross-language Py3 tests for Spark are running under Python 3.5 [4]. There
> may be more test suites that selectively use particular versions.
> > >> > We need to correct such suites, so that we do not tie them  to a
> specific Python version. I see several options here: such tests should run
> either for all high-priority versions, or run only under the lowest version
> among the high-priority versions.  We don't have to fix them all at the
> same time. In general, we should try to make it as easy as possible to
> configure, whether a suite runs across all  versions, all high-priority
> versions, or just one version.
> > >>
> > >> The high-priority/low-priority configuration would be useful
> for this.
> > >> Which versions to test may also be related to 5).
> > >>
> > >> > 5) If postcommit suites (that need to run against all versions)
> still constitute too much load on the infrastructure, we may need to
> investigate how to run these suites less frequently.
> > >>
> > >> That's certainly true, beam_PostCommit_PythonXX and
> > >> beam_PostCommit_Python_Chicago_Taxi_(Dataflow|Flink) take about 1
> > >> hour.
> > >> Does anyone have knowledge about this?
> > >
> > >
> > > I am familiar with beam_PostCommit_PythonXX suites. Is there something
> specific about these suites that you wanted to know?
> > > Test suite runtime will depend on the number of  tests in the suite,
> how many tests we run in parallel, how long they take to run. To understand
> the load on test infrastructure we can monitor Beam test health metrics
> [1]. In particular, if time in queue[2] is high, it is a sign that there
> are not enough Jenkins slots available to start the test suite earlier.
> > >
> > > [1] http://104.154.241.245/d/D81lW0pmk/post-commit-test-reliability
> > > [2]
> http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=1588094891600&to=1588699691600&panelId=6&fullscreen
> > >
> > >
> > >>
> > >> On Sat, May 2, 2020 at 5:18, Valentyn Tymofieiev <valen...@google.com> wrote:
> > >> >
> > >> > Hi Yoshiki,
> > >> >
> > >> > Thanks a lot for your help with Python 3 support so far and most
> recently, with your work on Python 3.8.
> > >> >
> > >> > Overall the proposal sounds good to me. I see several aspects here
> that we need to address:
> > >> >
> > >> > 1) We can seed the smoke test suite with typehints tests, and add
> more tests later if there is a need. We can identify them by the file path
> or by special attributes in test files. Identifying them using filepath
> seems simpler and independent of test runner.
> > >> >
> > >> > 2) Defining high priority/low priority versions in
> gradle.properties sounds good to me.
> > >> >
> > >> > 3)  We should reduce the code duplication across
> beam/sdks/python/test-suites/$runner/py3*. I think we could move the suite
> definition into a common file like
> beam/sdks/python/test-suites/$runner/build.gradle perhaps, and populate
> individual suites (beam/sdks/python/test-suites/$runner/py38/build.gradle)
> including the common file and/or logic from PythonNature [1].
> > >> >
> > >> > 4) We have some tests that we run only under specific Python 3
> versions, for example: FlinkValidatesRunner test runs using Python 3.5: [2]
> > >> > HDFS Python 3 tests are running only with Python 3.7 [3].
> Cross-language Py3 tests for Spark are running under Python 3.5 [4]. There
> may be more test suites that selectively use particular versions.
> > >> >
> > >> > We need to correct such suites, so that we do not tie them  to a
> specific Python version. I see several options here: such tests should run
> either for all high-priority versions, or run only under the lowest version
> among the high-priority versions.  We don't have to fix them all at the
> same time. In general, we should try to make it as easy as possible to
> configure, whether a suite runs across all  versions, all high-priority
> versions, or just one version.
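
The three scopes described above (all versions, all high-priority versions, or only the lowest high-priority version) could be resolved from a single piece of configuration. A minimal sketch in Python, assuming hypothetical names (`ALL_VERSIONS`, `HIGH_PRIORITY`, `versions_for`) and an example priority split that is illustrative only, not Beam's actual configuration:

```python
# Hypothetical configuration; the version lists are illustrative only.
ALL_VERSIONS = ["3.5", "3.6", "3.7", "3.8"]
HIGH_PRIORITY = ["3.5", "3.7"]


def _version_key(version):
    """Turn "3.10" into (3, 10) so versions sort numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def versions_for(scope):
    """Return the Python versions a suite should run under for a scope."""
    if scope == "all":
        return sorted(ALL_VERSIONS, key=_version_key)
    if scope == "high-priority":
        return sorted(HIGH_PRIORITY, key=_version_key)
    if scope == "lowest-high-priority":
        return [min(HIGH_PRIORITY, key=_version_key)]
    raise ValueError("unknown scope: %r" % scope)
```

With this shape, a suite declares only its scope, and changing which versions are high-priority means editing one list rather than every suite definition.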
> > >> >
> > >> > 5) If postcommit suites (that need to run against all versions)
> still constitute too much load on the infrastructure, we may need to
> investigate how to run these suites less frequently.
> > >> >
> > >> > [1]
> https://github.com/apache/beam/blob/b78c7ed4836e44177a149155581cfa8188e8f748/sdks/python/test-suites/portable/py37/build.gradle#L19-L20
> > >> > [2]
> https://github.com/apache/beam/blob/93181e792f648122d3b4a5080d683f21c6338132/.test-infra/jenkins/job_PostCommit_Python35_ValidatesRunner_Flink.groovy#L34
> > >> > [3]
> https://github.com/apache/beam/blob/93181e792f648122d3b4a5080d683f21c6338132/sdks/python/test-suites/direct/py37/build.gradle#L58
> > >> > [4]
> https://github.com/apache/beam/blob/93181e792f648122d3b4a5080d683f21c6338132/.test-infra/jenkins/job_PostCommit_CrossLanguageValidatesRunner_Spark.groovy#L44
> > >> >
> > >> > On Fri, May 1, 2020 at 8:42 AM Yoshiki Obata <
> yoshiki.ob...@gmail.com> wrote:
> > >> >>
> > >> >> Hello everyone.
> > >> >>
> > >> >> I'm working on Python 3.8 support[1] and now is the time for
> preparing
> > >> >> test infrastructure.
> > >> >> According to the discussion, I've considered how to prioritize
> tests.
> > >> >> My plan is as below. I'd like to get your thoughts on this.
> > >> >>
> > >> >> - For all low-pri Python versions, apache_beam.typehints.*_test runs in
> > >> >> the PreCommit test.
> > >> >>   A new Gradle task should be defined, like "preCommitPy3*-minimum".
> > >> >>   If there are essential tests for all versions other than typehints,
> > >> >>   please point them out.
> > >> >>
> > >> >> - With high-pri Python, the same tests as running in the current
> > >> >> PreCommit test run for testing extensively;
> "tox:py3*:preCommitPy3*",
> > >> >> "dataflow:py3*:preCommitIT" and "dataflow:py3*:preCommitIT_V2".
> > >> >>
> > >> >> - The full PreCommit tests for low-pri versions are moved to the
> > >> >> corresponding PostCommit tests.
> > >> >>
> > >> >> - High-pri and low-pri versions are defined in gradle.properties
> and
> > >> >> PreCommit/PostCommit task dependencies are built dynamically
> according
> > >> >> to them.
> > >> >>   This would make it easy to switch the priorities of Python versions.
> > >> >>
> > >> >> [1] https://issues.apache.org/jira/browse/BEAM-8494
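
The plan above derives the PreCommit task list from the two priority groups. The real wiring would live in Gradle; the following is only a Python sketch of the selection logic. The function name `precommit_tasks` is hypothetical, and the task names simply reuse the patterns quoted in the plan, with "3*" expanded to a tag such as "37":

```python
def precommit_tasks(high_priority, low_priority):
    """Build the PreCommit task list from the two priority groups.

    Versions are given as strings such as "3.7". Task names follow the
    patterns quoted in the plan, with "3*" replaced by e.g. "37".
    """
    tasks = []
    for version in high_priority:
        tag = version.replace(".", "")  # "3.7" -> "37"
        # High-priority versions keep the full set of PreCommit tasks.
        tasks.append("tox:py%s:preCommitPy%s" % (tag, tag))
        tasks.append("dataflow:py%s:preCommitIT" % tag)
        tasks.append("dataflow:py%s:preCommitIT_V2" % tag)
    for version in low_priority:
        tag = version.replace(".", "")
        # Low-priority versions run only the minimal (typehints) task.
        tasks.append("preCommitPy%s-minimum" % tag)
    return tasks
```

Swapping a version between the two input lists is then the only change needed to promote or demote it.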
> > >> >>
> > >> >> On Sat, Apr 4, 2020 at 7:51, Robert Bradshaw <rober...@google.com> wrote:
> > >> >> >
> > >> >> > https://pypistats.org/packages/apache-beam is an interesting
> data point.
> > >> >> >
> > >> >> > The good news: Python 3.x more than doubled to nearly 40% of
> downloads last month. Interestingly, it looks like a good chunk of this
> increase was 3.5 (which is now the most popular 3.x version by this
> metric...)
> > >> >> >
> > >> >> > I agree with using Python EOL dates as a baseline, with the
> possibility of case-by-case adjustments. Refactoring our tests to support
> 3.8 without increasing the load should be our focus now.
> > >> >> >
> > >> >> >
> > >> >> > On Fri, Apr 3, 2020 at 3:41 PM Valentyn Tymofieiev <
> valen...@google.com> wrote:
> > >> >> >>
> > >> >> >> Some good news on  Python 3.x support: thanks to +David Song
> and +Yifan Zou we now have Python 3.8 on Jenkins, and can start working on
> adding Python 3.8 support to Beam (BEAM-8494).
> > >> >> >>
> > >> >> >>> One interesting variable that has not been mentioned is what
> versions of python 3
> > >> >> >>> are available to users via their distribution channels (the
> linux
> > >> >> >>> distributions they use to develop/run the pipelines).
> > >> >> >>
> > >> >> >>
> > >> >> >> Good point. Looking at Ubuntu 16.04, which comes with Python
> 3.5.2, we can see that  the end-of-life for 16.04 is in 2024,
> end-of-support is April 2021 [1]. Both of these dates are beyond the
> announced Python 3.5 EOL in September 2020 [2]. I think it would be
> difficult for Beam to keep Py3.5 support until these EOL dates, and users
> of systems that stock old versions of Python have viable workarounds:
> > >> >> >> - install a newer version of Python interpreter via pyenv[3],
> from sources, or from alternative repositories.
> > >> >> >> - use a docker container that comes with a newer version of
> interpreter.
> > >> >> >> - use older versions of Beam.
> > >> >> >>
> > >> >> >> We didn't receive feedback from user@ on how long 3.x versions
> on the lower/higher end of the range should stay supported.  I would
> suggest for now that we plan to support all Python 3.x versions that were
> released and did not reach EOL. We can discuss exceptions to this rule on a
> case-by-case basis, evaluating any maintenance burden to continue support,
> or stop early.
> > >> >> >>
> > >> >> >> We should now focus on adjusting our Python test infrastructure
> to make it easy to split 3.5, 3.6, 3.7, 3.8  suites into high-priority and
> low-priority suites according to the Python version. Ideally, we should
> make it easy to change which versions are high/low priority without having
> to change all the individual test suites, and without losing test coverage
> signal.
> > >> >> >>
> > >> >> >> [1] https://wiki.ubuntu.com/Releases
> > >> >> >> [2] https://devguide.python.org/#status-of-python-branches
> > >> >> >> [3] https://github.com/pyenv/pyenv/blob/master/README.md
> > >> >> >>
> > >> >> >> On Fri, Feb 28, 2020 at 1:25 AM Ismaël Mejía <ieme...@gmail.com>
> wrote:
> > >> >> >>>
> > >> >> >>> One interesting variable that has not been mentioned is what
> versions of python
> > >> >> >>> 3 are available to users via their distribution channels (the
> linux
> > >> >> >>> distributions they use to develop/run the pipelines).
> > >> >> >>>
> > >> >> >>> - RHEL 8 users have python 3.6 available
> > >> >> >>> - RHEL 7 users have python 3.6 available
> > >> >> >>> - Debian 10/Ubuntu 18.04 users have python 3.7/3.6 available
> > >> >> >>> - Debian 9/Ubuntu 16.04 users have python 3.5 available
> > >> >> >>>
> > >> >> >>>
> > >> >> >>> We should consider this when we evaluate future support
> removals.
> > >> >> >>>
> > >> >> >>> Given  that the distros that support python 3.5 are ~4y old
> and since python 3.5
> > >> >> >>> is also losing LTS support soon, it is probably OK to not support
> it in Beam
> > >> >> >>> anymore as Robert suggests.
> > >> >> >>>
> > >> >> >>>
> > >> >> >>> On Thu, Feb 27, 2020 at 3:57 AM Valentyn Tymofieiev <
> valen...@google.com> wrote:
> > >> >> >>>>
> > >> >> >>>> Thanks everyone for sharing your perspectives so far. It
> sounds like we can mitigate the cost of test infrastructure by having:
> > >> >> >>>> - a selection of (fast) tests that we will want to run
> against all Python versions we support.
> > >> >> >>>> - high priority Python versions, which we will test
> extensively.
> > >> >> >>>> - infrequent postcommit tests that exercise low-priority
> versions.
> > >> >> >>>> We will need test infrastructure improvements to have the
> flexibility of designating versions as high-pri/low-pri and minimizing
> the effort required to adopt a new version.
> > >> >> >>>>
> > >> >> >>>> There is still a question of how long we want to support old
> Py3.x versions. As mentioned above, I think we should not support them
> beyond EOL (5 years after a release). I wonder if that is still too long.
> The cost of supporting a version may include:
> > >> >> >>>>  - Developing against older Python version
> > >> >> >>>>  - Release overhead (building & storing containers, wheels,
> doing release validation)
> > >> >> >>>>  - Complexity / development cost to support the quirks of the
> minor versions.
> > >> >> >>>>
> > >> >> >>>> We can decide to drop support, after, say, 4 years, or after
> usage drops below a threshold, or decide on a case-by-case basis. Thoughts?
> Also asked for feedback on user@ [1]
> > >> >> >>>>
> > >> >> >>>> [1]
> https://lists.apache.org/thread.html/r630a3b55aa8e75c68c8252ea6f824c3ab231ad56e18d916dfb84d9e8%40%3Cuser.beam.apache.org%3E
> > >> >> >>>>
> > >> >> >>>> On Wed, Feb 26, 2020 at 5:27 PM Robert Bradshaw <
> rober...@google.com> wrote:
> > >> >> >>>>>
> > >> >> >>>>> On Wed, Feb 26, 2020 at 5:21 PM Valentyn Tymofieiev <
> valen...@google.com> wrote:
> > >> >> >>>>> >
> > >> >> >>>>> > > +1 to consulting users.
> > >> >> >>>>> > I will message user@ as well and point to this thread.
> > >> >> >>>>> >
> > >> >> >>>>> > > I would propose getting in warnings about 3.5 EoL well
> ahead of time.
> > >> >> >>>>> > I think we should document on our website, and  in the
> code (warnings) that users should not expect SDKs to be supported in Beam
> beyond the EOL. If we want to have flexibility to drop support earlier than
> EOL, we need to be more careful with messaging because users might
> otherwise expect that support will last until EOL, if we mention EOL date.
> > >> >> >>>>>
> > >> >> >>>>> +1
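
The in-code warning discussed above could look roughly like the following. This is a minimal sketch using only the standard `warnings` module; the function name `warn_if_unsupported`, the `MIN_SUPPORTED` threshold, and the message text are assumptions for illustration, not Beam's actual implementation:

```python
import sys
import warnings

# Assumption for illustration: the oldest minor version still supported.
MIN_SUPPORTED = (3, 6)


def warn_if_unsupported(version_info=None, min_supported=MIN_SUPPORTED):
    """Emit a warning when running on a Python older than min_supported.

    Returns True if a warning was issued, False otherwise.
    """
    current = tuple((version_info or sys.version_info)[:2])
    if current < min_supported:
        warnings.warn(
            "Python %d.%d is older than the supported range; support may "
            "be removed in a future release." % current,
            UserWarning,
        )
        return True
    return False
```

Keeping the message free of a specific EOL date leaves room to drop support earlier without contradicting what users were told.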
> > >> >> >>>>>
> > >> >> >>>>> > I am hoping that we can establish a consensus for when we
> will be dropping support for a version, so that we don't have to discuss it
> on a case by case basis in the future.
> > >> >> >>>>> >
> > >> >> >>>>> > > I think it would make sense to add support for 3.8
> right away (or at least get a good sense of what work needs to be done and
> what our dependency situation is like)
> > >> >> >>>>> > https://issues.apache.org/jira/browse/BEAM-8494 is a
> starting point. I tried 3.8 a while ago and some dependencies failed to
> install; I checked again just now. The SDK is "installable" after minor changes.
> Some tests don't pass. BEAM-8494 does not have an owner atm, and if anyone
> is interested I'm happy to give further pointers and help get started.
> > >> >> >>>>> >
> > >> >> >>>>> > > For the 3.x series, I think we will get the most signal
> out of the lowest and highest version, and can get by with smoke tests +
> > >> >> >>>>> > infrequent post-commits for the ones between.
> > >> >> >>>>> >
> > >> >> >>>>> > > I agree with having low-frequency tests for low-priority
> versions. Low-priority versions could be determined according to least
> usage.
> > >> >> >>>>> >
> > >> >> >>>>> > These are good ideas. Do you think we will want to have an
> ability to run some (inexpensive) tests for all versions frequently (on
> presubmits), or is this extra complexity that can be avoided? I am thinking
> about type inference, for example. AFAIK inference logic is very sensitive
> to the version. Would it be acceptable to catch errors there in infrequent
> postcommits, or would an early signal be preferred?
> > >> >> >>>>>
> > >> >> >>>>> This is a good example--the type inference tests are
> sensitive to
> > >> >> >>>>> version (due to using internal details and relying on the
> > >> >> >>>>> still-evolving typing module) but also run in ~15 seconds. I
> think
> > >> >> >>>>> these should be in precommits. We just don't need to run
> every test
> > >> >> >>>>> for every version.
> > >> >> >>>>>
> > >> >> >>>>> > On Wed, Feb 26, 2020 at 5:17 PM Kyle Weaver <
> kcwea...@google.com> wrote:
> > >> >> >>>>> >>
> > >> >> >>>>> >> Oh, I didn't see Robert's earlier email:
> > >> >> >>>>> >>
> > >> >> >>>>> >> > Currently 3.5 downloads sit at 3.7%, or about
> > >> >> >>>>> >> > 20% of all Python 3 downloads.
> > >> >> >>>>> >>
> > >> >> >>>>> >> Where did these numbers come from?
> > >> >> >>>>> >>
> > >> >> >>>>> >> On Wed, Feb 26, 2020 at 5:15 PM Kyle Weaver <
> kcwea...@google.com> wrote:
> > >> >> >>>>> >>>
> > >> >> >>>>> >>> > I agree with having low-frequency tests for
> low-priority versions.
> > >> >> >>>>> >>> > Low-priority versions could be determined according to
> least usage.
> > >> >> >>>>> >>>
> > >> >> >>>>> >>> +1. While the difference may not be as great between,
> say, 3.6 and 3.7, I think that if we had to choose, it would be more useful
> to test the versions folks are actually using the most. 3.5 only has about
> a third of the Docker pulls of 3.6 or 3.7 [1]. Does anyone have other usage
> statistics we can consult?
> > >> >> >>>>> >>>
> > >> >> >>>>> >>> [1]
> https://hub.docker.com/search?q=apachebeam%2Fpython&type=image
> > >> >> >>>>> >>>
> > >> >> >>>>> >>> On Wed, Feb 26, 2020 at 5:00 PM Ruoyun Huang <
> ruo...@google.com> wrote:
> > >> >> >>>>> >>>>
> > >> >> >>>>> >>>> I feel 4+ versions take too long to run anything.
> > >> >> >>>>> >>>>
> > >> >> >>>>> >>>> would vote for lowest + highest,  2 versions.
> > >> >> >>>>> >>>>
> > >> >> >>>>> >>>> On Wed, Feb 26, 2020 at 4:52 PM Udi Meiri <
> eh...@google.com> wrote:
> > >> >> >>>>> >>>>>
> > >> >> >>>>> >>>>> I agree with having low-frequency tests for
> low-priority versions.
> > >> >> >>>>> >>>>> Low-priority versions could be determined according to
> least usage.
> > >> >> >>>>> >>>>>
> > >> >> >>>>> >>>>>
> > >> >> >>>>> >>>>>
> > >> >> >>>>> >>>>> On Wed, Feb 26, 2020 at 4:06 PM Robert Bradshaw <
> rober...@google.com> wrote:
> > >> >> >>>>> >>>>>>
> > >> >> >>>>> >>>>>> On Wed, Feb 26, 2020 at 3:29 PM Kenneth Knowles <
> k...@apache.org> wrote:
> > >> >> >>>>> >>>>>> >
> > >> >> >>>>> >>>>>> > Are these divergent enough that they all need to
> consume testing resources? For example can lower priority versions be daily
> runs or some such?
> > >> >> >>>>> >>>>>>
> > >> >> >>>>> >>>>>> For the 3.x series, I think we will get the most
> signal out of the
> > >> >> >>>>> >>>>>> lowest and highest version, and can get by with smoke
> tests +
> > >> >> >>>>> >>>>>> infrequent post-commits for the ones between.
> > >> >> >>>>> >>>>>>
> > >> >> >>>>> >>>>>> > Kenn
> > >> >> >>>>> >>>>>> >
> > >> >> >>>>> >>>>>> > On Wed, Feb 26, 2020 at 3:25 PM Robert Bradshaw <
> rober...@google.com> wrote:
> > >> >> >>>>> >>>>>> >>
> > >> >> >>>>> >>>>>> >> +1 to consulting users. Currently 3.5 downloads
> sit at 3.7%, or about
> > >> >> >>>>> >>>>>> >> 20% of all Python 3 downloads.
> > >> >> >>>>> >>>>>> >>
> > >> >> >>>>> >>>>>> >> I would propose getting in warnings about 3.5 EoL
> well ahead of time,
> > >> >> >>>>> >>>>>> >> at the very least as part of the 2.7 warning.
> > >> >> >>>>> >>>>>> >>
> > >> >> >>>>> >>>>>> >> Fortunately, supporting multiple 3.x versions is
> significantly easier
> > >> >> >>>>> >>>>>> >> than spanning 2.7 and 3.x. I would rather not
> impose an ordering on
> > >> >> >>>>> >>>>>> >> dropping 3.5 and adding 3.8 but consider their
> merits independently.
> > >> >> >>>>> >>>>>> >>
> > >> >> >>>>> >>>>>> >>
> > >> >> >>>>> >>>>>> >> On Wed, Feb 26, 2020 at 3:16 PM Kyle Weaver <
> kcwea...@google.com> wrote:
> > >> >> >>>>> >>>>>> >> >
> > >> >> >>>>> >>>>>> >> > 5 versions is too many IMO. We've had issues
> with Python precommit resource usage in the past, and adding another
> version would surely exacerbate those issues. And we have also already had
> to leave out certain features on 3.5 [1]. Therefore, I am in favor of
> dropping 3.5 before adding 3.8. After dropping Python 2 and adding 3.8,
> that will leave us with the latest three minor versions (3.6, 3.7, 3.8),
> which I think is closer to the "sweet spot." Though I would be interested
> in hearing if there are any users who would prefer we continue supporting
> 3.5.
> > >> >> >>>>> >>>>>> >> >
> > >> >> >>>>> >>>>>> >> > [1]
> https://github.com/apache/beam/blob/8658b95545352e51f35959f38334f3c7df8b48eb/sdks/python/apache_beam/runners/portability/flink_runner.py#L55
> > >> >> >>>>> >>>>>> >> >
> > >> >> >>>>> >>>>>> >> > On Wed, Feb 26, 2020 at 3:00 PM Valentyn
> Tymofieiev <valen...@google.com> wrote:
> > >> >> >>>>> >>>>>> >> >>
> > >> >> >>>>> >>>>>> >> >> I would like to start a discussion about
> identifying a guideline for answering questions like:
> > >> >> >>>>> >>>>>> >> >>
> > >> >> >>>>> >>>>>> >> >> 1. When will Beam support a new Python version
> (say, Python 3.8)?
> > >> >> >>>>> >>>>>> >> >> 2. When will Beam drop support for an old
> Python version (say, Python 3.5)?
> > >> >> >>>>> >>>>>> >> >> 3. How many Python versions should we aim to
> support concurrently (investigate issues, have continuous integration
> tests)?
> > >> >> >>>>> >>>>>> >> >> 4. What comes first: adding support for a new
> version (3.8) or deprecating older one (3.5)? This may affect the max load
> our test infrastructure needs to sustain.
> > >> >> >>>>> >>>>>> >> >>
> > >> >> >>>>> >>>>>> >> >> We are already getting requests for supporting
> Python 3.8 and there were some good reasons[1] to drop support for Python
> 3.5 (at least, early versions of 3.5). Answering these questions would help
> set expectations in Beam user community, Beam dev community, and  may help
> us establish resource requirements for test infrastructure and plan efforts.
> > >> >> >>>>> >>>>>> >> >>
> > >> >> >>>>> >>>>>> >> >> PEP-0602 [2] establishes a yearly release cycle
> for Python versions starting from 3.9. Each release is a long-term support
> release and is supported for 5 years: first 1.5 years allow for general bug
> fix support, remaining 3.5 years have security fix support.
> > >> >> >>>>> >>>>>> >> >>
> > >> >> >>>>> >>>>>> >> >> At every point, there may be up to 5 Python
> minor versions that did not yet reach EOL, see "Release overlap with 12
> month diagram" [3]. We can try to support all of them, but that may come at
> a cost of velocity: we will have more tests to maintain, and we will have
> to develop Beam against a lower version for a longer period. Supporting
> fewer versions will have implications for user experience. It also may be
> difficult to ensure support of the most recent version early, since our
> dependencies (e.g. picklers) may not be supporting them yet.
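
The "up to 5 versions" figure follows directly from the cadence: with one release every 12 months and 5 years (60 months) of support each, five releases are inside their support window at any time once the cycle reaches steady state. A small sketch of that arithmetic, under a simplified model that assumes releases land exactly on the cadence (the function name `supported_count` is hypothetical):

```python
def supported_count(month, cadence=12, support=60):
    """Count releases still supported at `month`.

    Releases are assumed to happen at months 0, cadence, 2*cadence, ...
    and each is supported for `support` months after its release.
    """
    releases = range(0, month + 1, cadence)
    return sum(1 for r in releases if month < r + support)
```

In this model the count ramps up during the first five years and then stays at five, which is the maximum load a "support everything until EOL" policy would put on the test infrastructure.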
> > >> >> >>>>> >>>>>> >> >>
> > >> >> >>>>> >>>>>> >> >> Currently we support 4 Python versions (2.7,
> 3.5, 3.6, 3.7).
> > >> >> >>>>> >>>>>> >> >>
> > >> >> >>>>> >>>>>> >> >> Is 4 versions a sweet spot? Too many? Too
> few? What do you think?
> > >> >> >>>>> >>>>>> >> >>
> > >> >> >>>>> >>>>>> >> >> [1]
> https://github.com/apache/beam/pull/10821#issuecomment-590167711
> > >> >> >>>>> >>>>>> >> >> [2] https://www.python.org/dev/peps/pep-0602/
> > >> >> >>>>> >>>>>> >> >> [3]
> https://www.python.org/dev/peps/pep-0602/#id17
>
