tvalentyn commented on code in PR #25313:
URL: https://github.com/apache/beam/pull/25313#discussion_r1098248699


##########
sdks/python/apache_beam/runners/worker/bundle_processor.py:
##########
@@ -823,6 +825,38 @@ def only_element(iterable):
   return element
 
 
+def _extract_py_version(capability):
+  return capability[len(_PY_SDK_CAPABILITY_PREFIX):].split("_sdk:")[0]
+
+
+def _verify_descriptor_created_in_a_compatible_env(process_bundle_descriptor):
+  # type: beam_fn_api_pb2.ProcessBundleDescriptor -> None
+
+  py_envs_at_submission = set()
+  for _, env in process_bundle_descriptor.environments.items():

Review Comment:
   > Rather than enumerating over all environments, just enumerate over the 
environments associated with the transforms of this bundle descriptor such that 
we can require an exact match.
   
   Made the changes; the logic is simpler now. However, it looks like Dataflow's 
runner v2 does not set the environment_id field on transforms in the 
ProcessBundleDescriptor, perhaps because of an implicit assumption that there is 
only one environment per SDK harness. I also noticed that the UW harness starts 
with a mapping of {SDK_harness_id: environment_id}, but this mapping is not 
visible to the SDK harness. The PBD still carries a list of environments.
   
   The change works with PortableRunner.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
