tvalentyn commented on a change in pull request #14189:
URL: https://github.com/apache/beam/pull/14189#discussion_r599651690
##########
File path: sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
##########
@@ -740,6 +719,25 @@ def _apply_sdk_environment_overrides(
new_payload.container_image = new_container_image
environment.payload = new_payload.SerializeToString()
+ # De-dup environments by Docker container image since currently Dataflow
Review comment:
Thanks for tagging me on this change. I am working on a related change
to reflect pipeline resource hints in the portable pipeline representation. Hints
are defined in `Environment.resource_hints`. Transforms are mapped to
environments, and different transforms can have different hints. Therefore, we
can have multiple environments with different hints, but the same container
image.
My change is not yet ready for review, but the replication logic looks like
this:
https://github.com/apache/beam/pull/14082/files#diff-252b68d1b24f6f7cdd8c5e54163d4856afad59fd385f5f6a91bf0fe66f09e67dR243
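I think the deduplication logic as proposed in this change will be difficult to
reconcile with the resource hints representation: environments that share a
container image may still differ in their hints, so keying on the image alone
would merge environments that need to stay distinct.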
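
To illustrate the concern, here is a minimal sketch. It does not use the Beam
protos: `FakeEnvironment`, the `min_ram` hint key, the image name, and both
helper functions are hypothetical stand-ins chosen for illustration. It only
shows how de-duplicating by container image alone drops an environment whose
hints differ, whereas a key that also includes the hints keeps it.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FakeEnvironment:
    """Hypothetical stand-in for a portable Environment (not the Beam proto)."""
    container_image: str
    resource_hints: Dict[str, bytes] = field(default_factory=dict)


def dedup_by_image(envs: Dict[str, FakeEnvironment]) -> Dict[str, FakeEnvironment]:
    """Keep the first environment seen for each container image."""
    first_id_for_key = {}
    for env_id, env in envs.items():
        first_id_for_key.setdefault(env.container_image, env_id)
    return {env_id: envs[env_id] for env_id in first_id_for_key.values()}


def dedup_by_image_and_hints(
        envs: Dict[str, FakeEnvironment]) -> Dict[str, FakeEnvironment]:
    """Keep the first environment seen for each (image, hints) combination."""
    first_id_for_key = {}
    for env_id, env in envs.items():
        # Sort the hint items so the key does not depend on insertion order.
        key = (env.container_image, tuple(sorted(env.resource_hints.items())))
        first_id_for_key.setdefault(key, env_id)
    return {env_id: envs[env_id] for env_id in first_id_for_key.values()}


# Three environments with the same (made-up) image; env_b and env_c share hints.
IMAGE = "example.invalid/beam_sdk:test"
envs = {
    "env_a": FakeEnvironment(IMAGE, {"min_ram": b"4GB"}),
    "env_b": FakeEnvironment(IMAGE, {"min_ram": b"16GB"}),
    "env_c": FakeEnvironment(IMAGE, {"min_ram": b"16GB"}),
}

by_image = dedup_by_image(envs)
by_image_and_hints = dedup_by_image_and_hints(envs)

print(sorted(by_image))            # env_b's distinct hints are lost
print(sorted(by_image_and_hints))  # env_a and env_b both survive
```

With image-only keys, transforms mapped to `env_b` would end up pointing at an
environment carrying `env_a`'s hints, which is the mismatch I am worried about.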