ihji commented on code in PR #22230:
URL: https://github.com/apache/beam/pull/22230#discussion_r919455115


##########
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java:
##########
@@ -898,8 +898,19 @@ protected RunnerApi.Pipeline applySdkEnvironmentOverrides(
           throw new RuntimeException("Error parsing environment docker 
payload.", e);
         }
         String containerImage = dockerPayload.getContainerImage();
+        boolean updated = false;
         for (int i = 0; i < overrides.length; i += 2) {
           containerImage = containerImage.replaceAll(overrides[i], overrides[i 
+ 1]);
+          if (!containerImage.equals(dockerPayload.getContainerImage())) {
+            updated = true;
+          }
+        }
+        if (containerImage.startsWith("apache/beam")

Review Comment:
   It updates foreign images including Python and Go but not a default Java 
image set by current SDK environment.
   
   I think this can be done by the service too but I think it's safer and has 
more flexibility when we configure the appropriate URLs from the SDK. Usually 
we don't encounter this problem since we know the runner by the time we convert 
pipeline object to pipeline proto. However, for multi-language pipelines, we 
don't know the runner when we expand external transforms in the expansion 
service (we just assume open-source runners). We need to update the environment 
for external transforms when the information about the runner is available.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to