ihji commented on code in PR #22230:
URL: https://github.com/apache/beam/pull/22230#discussion_r919455115
##########
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java:
##########
@@ -898,8 +898,19 @@ protected RunnerApi.Pipeline applySdkEnvironmentOverrides(
throw new RuntimeException("Error parsing environment docker
payload.", e);
}
String containerImage = dockerPayload.getContainerImage();
+ boolean updated = false;
for (int i = 0; i < overrides.length; i += 2) {
containerImage = containerImage.replaceAll(overrides[i], overrides[i
+ 1]);
+ if (!containerImage.equals(dockerPayload.getContainerImage())) {
+ updated = true;
+ }
+ }
+ if (containerImage.startsWith("apache/beam")
Review Comment:
It updates foreign images including Python and Go but not a default Java
image set by current SDK environment.
I think this can be done by the service too but I think it's safer and has
more flexibility when we configure the appropriate URLs from the SDK. Usually
we don't encounter this problem since we know the runner by the time we convert
pipeline object to pipeline proto. However, for multi-language pipelines, we
don't know the runner when we expand external transforms in the expansion
service (we just assume open-source runners). We need to update the environment
for external transforms when the information about the runner is available.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]