[
https://issues.apache.org/jira/browse/BEAM-9692?focusedWorklogId=421714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421714
]
ASF GitHub Bot logged work on BEAM-9692:
----------------------------------------
Author: ASF GitHub Bot
Created on: 13/Apr/20 22:37
Start Date: 13/Apr/20 22:37
Worklog Time Spent: 10m
Work Description: robertwb commented on pull request #11335: [BEAM-9692]:
Make CombineValues portable
URL: https://github.com/apache/beam/pull/11335#discussion_r407760055
##########
File path: sdks/python/apache_beam/runners/dataflow/ptransform_overrides.py
##########
@@ -111,3 +111,38 @@ def expand(self, pbegin):
return JrhRead().with_output_types(
ptransform.get_type_hints().simple_output_type('Read'))
+
+
+class CombineValuesPTransformOverride(PTransformOverride):
+ """A ``PTransformOverride`` for ``CombineValues``.
+
+ The DataflowRunner expects that the CombineValues PTransform acts as a
+ primitive. So this override replaces the CombineValues with a primitive.
+ """
+ def matches(self, applied_ptransform):
+ # Imported here to avoid circular dependencies.
+ # pylint: disable=wrong-import-order, wrong-import-position
+ from apache_beam import CombineValues
+
+ if isinstance(applied_ptransform.transform, CombineValues):
+ self.transform = applied_ptransform.transform
+ return True
+ return False
+
+ def get_replacement_transform(self, ptransform):
+ # Imported here to avoid circular dependencies.
+ # pylint: disable=wrong-import-order, wrong-import-position
+ from apache_beam import PTransform
+ from apache_beam.pvalue import PCollection
+
+ # The DataflowRunner still needs access to the CombineValues members to
Review comment:
It would be preferable to simply let try and find methods for composites as
well, rather than using PTransformOverrides. This would likely help with the
GBK one too.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 421714)
Time Spent: 1h (was: 50m)
> Clean Python DataflowRunner to use portable pipelines
> -----------------------------------------------------
>
> Key: BEAM-9692
> URL: https://issues.apache.org/jira/browse/BEAM-9692
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Reporter: Sam Rohde
> Assignee: Sam Rohde
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)