Hi Chad,

I am sorry to hear that you are still impacted by the lack of Python 3
support in some of the libraries used in the VFX industry.

The plan to sunset Python 2 support in Dataflow on October 7, 2020 was
announced a while ago in the Dataflow UI, on the website, and through other
channels. With each new Beam release starting from 2.17.0, the SDK support
status page[1] has included a note that Dataflow would stop supporting
pipelines using Python 2 on October 7. This timeline was chosen
irrespective of the status of the Python 2 sunset in Beam.

Python 2 pipelines will continue to work for some time; however, some
may eventually stop working due to changes in infrastructure that
Dataflow does not control.

Pipelines using Apache Beam SDKs that are no longer supported may be
interrupted at submission[2] to make sure that users are aware that they
are using an unsupported SDK and that their pipelines are at risk[3] of
failing abruptly and permanently.

Hope this answers your questions.

Note that:
- with portable execution of Beam pipelines, it should be possible to hide
the Py2 portions of a pipeline behind an expansion service, and run a
Python 3 pipeline with "cross-language" Python 2 transforms;
- users can launch Dataflow jobs using custom SDKs (without support
guarantees), be it Go, Python 2 or C#.

Thanks,
Valentyn

[1]
https://cloud.google.com/dataflow/docs/support/sdk-version-support-status#apache-beam-2x-sdks
[2] https://cloud.google.com/dataflow/docs/support/using-unsupported-sdks
[3] Python Batch pipelines using shuffle (GroupByKey) running on
non-portable infrastructure (2.22.0 or older SDKs that do not support
--experiment=use_runner_v2) are at risk.

On Tue, Oct 13, 2020 at 11:27 AM Chad Dombrova <[email protected]> wrote:

> Hi all,
> Those of you who have been following the python2/3 topics on this thread
> know that the industry that I work for is a bit behind the times;  we're
> still waiting for all the libraries we need to be ported to python3.
> Here's a handy website for those who are curious: https://vfxpy.com/
>
> I'm hoping that we'll have most of what we need by EOY or shortly
> thereafter, but on the Beam side that leaves us with a gap of 3 months or
> more.  I was hoping that we'd be able to coast over that gap using Beam
> 2.24, but Dataflow dropped support for python2 a week ago [1].
>
> So my question for the Dataflow team is this:  if we lock down our
> pipelines to use 2.24 -- the last version of Beam to officially support
> python2 -- and run our SDK workers inside docker containers, will new jobs
> submitted to Dataflow continue to work on python2?   Or will the Dataflow
> runner itself stop being able to execute python2 code regardless of the
> version of Beam used?  As an example, I know that PubSubIO, which we use
> extensively, is not part of Beam itself, but is sorta magically patched
> into the pipeline by Dataflow.
>
> I will say that when I agreed that I was satisfied that Beam 2.24 would be
> the last python2 release to support python2, I did so under the assumption
> that pipelines running 2.24 would be supported for longer than a few
> weeks:  2.24 was released to PyPI on Sept 16, and python2 pipelines became
> unsupported on Dataflow October 7th.
>
> thanks,
> -chad
>
> [1] https://cloud.google.com/python/docs/python2-sunset/#dataflow
>
>
