tvalentyn commented on a change in pull request #14189:
URL: https://github.com/apache/beam/pull/14189#discussion_r600931278



##########
File path: sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
##########
@@ -740,6 +720,28 @@ def _apply_sdk_environment_overrides(
       new_payload.container_image = new_container_image
       environment.payload = new_payload.SerializeToString()
 
+    # De-dup environments that use Java SDK by Docker container image since
+    # currently running multiple Java SDK Harnesses with Dataflow could result
+    # in dependency conflicts.
+    # TODDO(BEAM-9455): remove following restriction when Dataflow supports
+    # environment specific dependency provisioning.
+    container_url_to_env_map = dict()
+    container_url_to_env_id_map = dict()
+    for transform in proto_pipeline.components.transforms.values():
+      environment_id = transform.environment_id
+      if not environment_id:
+        continue
+      environment = proto_pipeline.components.environments[environment_id]
+      docker_payload = proto_utils.parse_Bytes(
+          environment.payload, beam_runner_api_pb2.DockerPayload)
+      image = docker_payload.container_image
+      if is_beam_java_container_name(image):

Review comment:
       Is it possible to do this rewrite later (before submitting the v1beta3 
job)? I am seeing that this code is will be called before we translate beam 
pipeline into v1beta3 Steps. It will be difficult to support resource hints for 
xlang with this rewrite. You also need to do the same in portable job 
submission codepath in DF service. I wonder if doing this rewrite on cloud 
workflow proto level (service change) would be a better path.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to