lukecwik commented on a change in pull request #11060: [BEAM-9454] Create 
Deduplication transform based on user timer/state
URL: https://github.com/apache/beam/pull/11060#discussion_r391728602
 
 

 ##########
 File path: sdks/python/apache_beam/runners/sdf_utils.py
 ##########
 @@ -244,3 +251,63 @@ def get_estimator_state(self):
         return None
 
     return _NoOpWatermarkEstimator()
+
+
+class DeduplictaionWithinDuration(ptransform.PTransform):
 
 Review comment:
   It would be useful to expose a keyed deduplication transform as the common 
implementation that all of the deduplication transforms use internally. In the 
future we could then turn it into a well-known URN, and runners could provide 
their own optimized implementations of it.
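
   To make the layering concrete, here is a rough plain-Python model of it. This 
is not Beam code and not what the PR implements: the names KeyedDeduplicator, 
deduplicate_values and deduplicate_by are hypothetical, and the dict of expiry 
times only stands in for per-key user state plus an expiry timer.

```python
from typing import Any, Callable, Dict, Hashable, Iterable, Iterator, List, Tuple


class KeyedDeduplicator:
    """Hypothetical keyed core: emit the first value seen per key within `duration`."""

    def __init__(self, duration: float) -> None:
        self.duration = duration
        # Stand-in for per-key state and an expiry timer: key -> expiry time.
        self._expiry: Dict[Hashable, float] = {}

    def process(self, key: Hashable, value: Any, timestamp: float) -> List[Tuple[Hashable, Any]]:
        expiry = self._expiry.get(key)
        if expiry is not None and timestamp < expiry:
            # Key already seen and not yet expired: drop the duplicate.
            # The expiry is not extended by duplicates.
            return []
        # First sighting (or the previous entry expired): remember it and emit.
        self._expiry[key] = timestamp + self.duration
        return [(key, value)]


def deduplicate_by(
    elements: Iterable[Tuple[Any, float]],
    duration: float,
    key_fn: Callable[[Any], Hashable],
) -> Iterator[Any]:
    """Wrapper: derive a key from each (value, timestamp) pair, then reuse the keyed core."""
    core = KeyedDeduplicator(duration)
    for value, timestamp in elements:
        for _, emitted in core.process(key_fn(value), value, timestamp):
            yield emitted


def deduplicate_values(
    elements: Iterable[Tuple[Any, float]], duration: float
) -> Iterator[Any]:
    """Wrapper: deduplicate whole values by using each value as its own key."""
    return deduplicate_by(elements, duration, key_fn=lambda value: value)
```

   Both wrappers only choose a key and delegate to the keyed core, so the core is 
the single piece a runner would need to replace with an optimized version.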
   
   We want pipeline authors to use this transform, so I think it should go into 
sdks/python/apache_beam/transforms/util.py or into a dedicated file under 
sdks/python/apache_beam/transforms/, such as deduplicate.py.
   CC: @udim What do you think?
