damccorm commented on code in PR #29642:
URL: https://github.com/apache/beam/pull/29642#discussion_r1418012439


##########
sdks/python/apache_beam/transforms/util.py:
##########
@@ -748,6 +749,32 @@ def flush_batch(
   return _StatefulBatchElementsDoFn()
 
 
+class SharedKey():
+  """A class that holds a per-process UUID used to key elements for streaming
+  BatchElements. 
+  """
+  def __init__(self):
+    self.key = uuid.uuid4().hex
+
+
+def load_shared_key():
+  return SharedKey()
+
+
+class WithSharedKey(DoFn):
+  """A DoFn that keys elements with a per-process UUID. Used in streaming
+  BatchElements.  
+  """
+  def __init__(self):
+    self.shared_handle = shared.Shared()
+
+  def setup(self):
+    self.key = self.shared_handle.acquire(load_shared_key).key

Review Comment:
   I think we should pass in a tag (could be `'WithSharedKey'`), that way we 
avoid conflicts if another DoFn happens to load a shared object without a tag.
   
   Not super important or likely to happen, but probably worth doing just in 
case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to