robertwb commented on code in PR #22230:
URL: https://github.com/apache/beam/pull/22230#discussion_r920392179
##########
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java:
##########
@@ -898,8 +898,19 @@ protected RunnerApi.Pipeline applySdkEnvironmentOverrides(
throw new RuntimeException("Error parsing environment docker
payload.", e);
}
String containerImage = dockerPayload.getContainerImage();
+ boolean updated = false;
for (int i = 0; i < overrides.length; i += 2) {
containerImage = containerImage.replaceAll(overrides[i], overrides[i
+ 1]);
+ if (!containerImage.equals(dockerPayload.getContainerImage())) {
+ updated = true;
+ }
+ }
+ if (containerImage.startsWith("apache/beam")
Review Comment:
Transform expansion should never depend on the runner; if it does, that's a
bug.
We should definitely provide a docker hub mirror for official beam images on
gcr and configure it as a default container registry instead of external docker
hub; I don't know how hard that is to set up.
It still feels a bit sketchy for this code to live in each SDK as it seems
to be a service/runner issue, but I suppose DataflowRunner is "part" of the
runner.
Also, are we sure that for every `apache/beamX` there is always a
corresponding `gcr.io/cloud-dataflow/v1beta3/beamX`, even for dev containers
and such? (Specifically, isn't there a gap between when Beam releases happen
and the dataflow dev containers are available?)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]