Hi Chad,

I am sorry to hear that you are still impacted by the lack of Python 3 support in some of the libraries used in the VFX industry.

The plan to sunset Python 2 support in Dataflow on October 7, 2020 was announced a while ago in the Dataflow UI, website and other channels. With each new Beam release starting from 2.17.0, the SDK support status page[1] included a note that on October 7, Dataflow will stop supporting pipelines using Python 2. This timeline was chosen irrespective of the status of Python 2 sunset in Beam.

Python 2 pipelines will continue to work for some time, however some eventually may stop working due to changes in the infrastructure that Dataflow does not control. Pipelines using Apache Beam SDKs that are no longer supported may be interrupted at submission[2] to make sure that users are aware that they are using an unsupported SDK and their pipelines are at risk[3] of failing abruptly and permanently.

Hope this answers your questions. Note that:
- with portable execution of Beam pipelines it should be possible to hide Py2 portions of pipeline behind an expansion service, and run a Python 3 pipeline with "cross-language" Python 2 transforms;
- users can launch Dataflow jobs using custom SDKs (and without support guarantees), be it Go, Python 2 or C#.

Thanks,
Valentyn

[1] https://cloud.google.com/dataflow/docs/support/sdk-version-support-status#apache-beam-2x-sdks
[2] https://cloud.google.com/dataflow/docs/support/using-unsupported-sdks
[3] Python Batch pipelines using shuffle (GroupByKey) running on non-portable infrastructure (2.22.0 or older SDKs that do not support --experiment=use_runner_v2) are at risk.

On Tue, Oct 13, 2020 at 11:27 AM Chad Dombrova <[email protected]> wrote:

> Hi all,
> Those of you who have been following the python2/3 topics on this thread
> know that the industry that I work for is a bit behind the times; we're
> still waiting for all the libraries we need to be ported to python3.
> Here's a handy website for those who are curious: https://vfxpy.com/
>
> I'm hoping that we'll have most of what we need by EOY or shortly
> thereafter, but on the Beam side that leaves us with a gap of 3 months or
> more. I was hoping that we'd be able to coast over that gap using Beam
> 2.24, but Dataflow dropped support for python2 a week ago [1].
>
> So my question for the Dataflow team is this: if we lock down our
> pipelines to use 2.24 -- the last version of Beam to officially support
> python2 -- and run our SDK workers inside docker containers, will new jobs
> submitted to Dataflow continue to work on python2? Or will the Dataflow
> runner itself stop being able to execute python2 code regardless of the
> version of Beam used? As an example, I know that PubSubIO, which we use
> extensively, is not part of Beam itself, but is sorta magically patched
> into the pipeline by Dataflow.
>
> I will say that when I agreed that I was satisfied that Beam 2.24 would be
> the last python2 release to support python2, I did so under the assumption
> that pipelines running 2.24 would be supported for longer than a few
> weeks: 2.24 was released to PyPI on Sept 16, and python2 pipelines became
> unsupported on Dataflow October 7th.
>
> thanks,
> -chad
>
> [1] https://cloud.google.com/python/docs/python2-sunset/#dataflow
