[ 
https://issues.apache.org/jira/browse/BEAM-9692?focusedWorklogId=422155&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422155
 ]

ASF GitHub Bot logged work on BEAM-9692:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Apr/20 16:55
            Start Date: 14/Apr/20 16:55
    Worklog Time Spent: 10m 
      Work Description: rohdesamuel commented on pull request #11335: 
[BEAM-9692]: Make CombineValues portable
URL: https://github.com/apache/beam/pull/11335#discussion_r408289779
 
 

 ##########
 File path: sdks/python/apache_beam/runners/dataflow/ptransform_overrides.py
 ##########
 @@ -111,3 +111,38 @@ def expand(self, pbegin):
 
     return JrhRead().with_output_types(
         ptransform.get_type_hints().simple_output_type('Read'))
+
+
+class CombineValuesPTransformOverride(PTransformOverride):
+  """A ``PTransformOverride`` for ``CombineValues``.
+
+  The DataflowRunner expects that the CombineValues PTransform acts as a
+  primitive. So this override replaces the CombineValues with a primitive.
+  """
+  def matches(self, applied_ptransform):
+    # Imported here to avoid circular dependencies.
+    # pylint: disable=wrong-import-order, wrong-import-position
+    from apache_beam import CombineValues
+
+    if isinstance(applied_ptransform.transform, CombineValues):
+      self.transform = applied_ptransform.transform
+      return True
+    return False
+
+  def get_replacement_transform(self, ptransform):
+    # Imported here to avoid circular dependencies.
+    # pylint: disable=wrong-import-order, wrong-import-position
+    from apache_beam import PTransform
+    from apache_beam.pvalue import PCollection
+
+    # The DataflowRunner still needs access to the CombineValues members to
 
 Review comment:
   Sorry, I don't understand what you mean. Can you elaborate? What does "let 
try and find methods for composites as well" mean? What would an alternative to 
PTransformOverrides be?
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 422155)
    Time Spent: 1h 20m  (was: 1h 10m)

> Clean Python DataflowRunner to use portable pipelines
> -----------------------------------------------------
>
>                 Key: BEAM-9692
>                 URL: https://issues.apache.org/jira/browse/BEAM-9692
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Sam Rohde
>            Assignee: Sam Rohde
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to