[
https://issues.apache.org/jira/browse/BEAM-9692?focusedWorklogId=422370&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422370
]
ASF GitHub Bot logged work on BEAM-9692:
----------------------------------------
Author: ASF GitHub Bot
Created on: 14/Apr/20 22:00
Start Date: 14/Apr/20 22:00
Worklog Time Spent: 10m
Work Description: robertwb commented on pull request #11335: [BEAM-9692]:
Make CombineValues portable
URL: https://github.com/apache/beam/pull/11335#discussion_r408461337
##########
File path: sdks/python/apache_beam/runners/dataflow/ptransform_overrides.py
##########
@@ -111,3 +111,38 @@ def expand(self, pbegin):
return JrhRead().with_output_types(
ptransform.get_type_hints().simple_output_type('Read'))
+
+
+class CombineValuesPTransformOverride(PTransformOverride):
+ """A ``PTransformOverride`` for ``CombineValues``.
+
+ The DataflowRunner expects that the CombineValues PTransform acts as a
+ primitive. So this override replaces the CombineValues with a primitive.
+ """
+ def matches(self, applied_ptransform):
+ # Imported here to avoid circular dependencies.
+ # pylint: disable=wrong-import-order, wrong-import-position
+ from apache_beam import CombineValues
+
+ if isinstance(applied_ptransform.transform, CombineValues):
+ self.transform = applied_ptransform.transform
+ return True
+ return False
+
+ def get_replacement_transform(self, ptransform):
+ # Imported here to avoid circular dependencies.
+ # pylint: disable=wrong-import-order, wrong-import-position
+ from apache_beam import PTransform
+ from apache_beam.pvalue import PCollection
+
+ # The DataflowRunner still needs access to the CombineValues members to
Review comment:
I was thinking that run_xxx could also be called for composites. That might,
however, be a bigger change, so we can go with this approach.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 422370)
Time Spent: 1.5h (was: 1h 20m)
> Clean Python DataflowRunner to use portable pipelines
> -----------------------------------------------------
>
> Key: BEAM-9692
> URL: https://issues.apache.org/jira/browse/BEAM-9692
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Reporter: Sam Rohde
> Assignee: Sam Rohde
> Priority: Major
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)