Daniel Lescohier created BEAM-6955:
--------------------------------------
Summary: Support Dataflow --sdk_location with modified version number
Key: BEAM-6955
URL: https://issues.apache.org/jira/browse/BEAM-6955
Project: Beam
Issue Type: Bug
Components: runner-dataflow
Affects Versions: 2.11.0
Reporter: Daniel Lescohier
Support Dataflow --sdk_location with modified version number
Determine the Google Container Registry version tag for the service images
used on the Dataflow worker nodes. Users of Dataflow may be running a
locally modified version of Apache Beam, which they submit to Dataflow with
the --sdk_location option. Those users would most likely modify the Apache
Beam version number so they can distinguish their build from the public
distribution of Apache Beam. However, the remote nodes in Dataflow still need
to bootstrap the worker service with a Docker image for which a version tag
exists.
The most appropriate way for system integrators to modify the Apache Beam
version number would be to add a Local Version Identifier:
https://www.python.org/dev/peps/pep-0440/#local-version-identifiers
If people only use Local Version Identifiers, then we could use the "public"
attribute of the pkg_resources version object.
If people instead use a post-release version identifier:
https://www.python.org/dev/peps/pep-0440/#post-releases
then "public" would still include the post-release segment, and only the
"base_version" attribute would work for both kinds of version number change.
Since the Dataflow documentation does not specify how to modify version
numbers, I am choosing to use the "base_version" attribute.
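As a minimal sketch of the difference between the two attributes, the
snippet below parses an unmodified version, one with a Local Version
Identifier, and one with a post-release identifier. The helper name
container_version_tag and the example version strings are hypothetical, used
only for illustration; this is not the code in the Beam repository. The
fallback import assumes the "packaging" library is installed when
pkg_resources is not available.

```python
try:
    # pkg_resources ships with setuptools and is what the issue refers to.
    from pkg_resources import parse_version
except ImportError:
    # Assumption: fall back to the equivalent parser from "packaging".
    from packaging.version import parse as parse_version


def container_version_tag(beam_version):
    """Hypothetical helper: reduce a possibly modified Beam version string
    to the release number that a published image tag would exist for."""
    return parse_version(beam_version).base_version


if __name__ == "__main__":
    for version in ("2.11.0", "2.11.0+acme.1", "2.11.0.post1"):
        parsed = parse_version(version)
        # "public" drops only the local segment (+acme.1);
        # "base_version" also drops the post-release segment (.post1).
        print(version, parsed.public, parsed.base_version)
```

With these inputs, "public" yields 2.11.0 for the local-version case but
2.11.0.post1 for the post-release case, while "base_version" yields 2.11.0
for all three.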
Will shortly submit a PR with the change.