[ 
https://issues.apache.org/jira/browse/BEAM-3736?focusedWorklogId=506267&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506267
 ]

ASF GitHub Bot logged work on BEAM-3736:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Oct/20 16:26
            Start Date: 29/Oct/20 16:26
    Worklog Time Spent: 10m 
      Work Description: kamilwu commented on a change in pull request #13048:
URL: https://github.com/apache/beam/pull/13048#discussion_r514395054



##########
File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
##########
@@ -411,6 +411,33 @@ def visit_transform(self, transform_node):
 
     return FlattenInputVisitor()
 
+  @staticmethod
+  def combinefn_visitor():
+    # Imported here to avoid circular dependencies.
+    from apache_beam.pipeline import PipelineVisitor
+    from apache_beam import core
+
+    class CombineFnVisitor(PipelineVisitor):
+      """Checks if `CombineFn` has non-default setup or teardown methods.
+      If yes, raises `ValueError`.
+      """
+      def visit_transform(self, applied_transform):
+        transform = applied_transform.transform
+        if isinstance(transform, core.ParDo) and isinstance(
+            transform.fn, core.CombineValuesDoFn):
+          if self._overrides_setup_or_teardown(transform.fn.combinefn):
+            raise ValueError(
+                'CombineFn.setup and CombineFn.teardown are '
+                'not supported with non-portable Dataflow '
+                'runner. Please use Dataflow Runner V2 instead.')

Review comment:
       I think this is a question for the Dataflow team. From my perspective, 
there is no need to support this in the non-portable Dataflow runner, given 
that new batch pipelines will start using Dataflow Runner V2 in a month 
(December 4). 
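
The diff calls `self._overrides_setup_or_teardown(...)` but the hunk does not show its body. One plausible way to write such a check is sketched below. This is an assumption, not the code from the pull request: `CombineFn` here is a minimal stand-in for `apache_beam.CombineFn`, and `overrides_setup_or_teardown`, `PlainFn` and `TableFn` are hypothetical names used only for illustration.

```python
class CombineFn:
    """Stand-in for apache_beam.CombineFn, reduced to its lifecycle hooks."""

    def setup(self, *args, **kwargs):
        pass

    def teardown(self, *args, **kwargs):
        pass


def overrides_setup_or_teardown(combinefn):
    """Return True if the object's class replaces setup or teardown.

    Hypothetical helper: it compares the functions found on the class with
    the defaults defined on the base class. Attributes assigned directly on
    an instance are not detected.
    """
    cls = type(combinefn)
    return (
        cls.setup is not CombineFn.setup
        or cls.teardown is not CombineFn.teardown)


class PlainFn(CombineFn):
    """Keeps the default lifecycle hooks."""


class TableFn(CombineFn):
    """Overrides setup, so a visitor like the one above would reject it."""

    def setup(self):
        self.table = {'a': 1}
```

With a check of this shape, the visitor would raise `ValueError` for `TableFn()` and let `PlainFn()` through.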




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 506267)
    Time Spent: 8h 10m  (was: 8h)

> Add SetUp() and TearDown() for CombineFns
> -----------------------------------------
>
>                 Key: BEAM-3736
>                 URL: https://issues.apache.org/jira/browse/BEAM-3736
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model, sdk-py-core
>            Reporter: Chuan Yu Foo
>            Assignee: Kamil Wasilewski
>            Priority: P3
>          Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> I have a CombineFn that has a large amount of state that needs to be loaded 
> once before it can add_input or merge_accumulators (for example, the 
> CombineFn might load up a large lookup table used for combining). 
> Right now, to initialise this state, for each of the methods, I check if the 
> state has already been initialised, and if not, I initialise it. It would be 
> nice if CombineFn provided a SetUp() method that is called once to initialise 
> this state (and a corresponding TearDown() method to clean up this state if 
> necessary).
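
The two approaches described in the issue can be contrasted with a small sketch. It assumes nothing beyond the method names used by the Python `CombineFn` interface; the classes are plain Python stand-ins rather than Beam subclasses, and `loader`, `LazyLookupSum`, `SetupLookupSum` and `run` are hypothetical names introduced for illustration.

```python
class LazyLookupSum:
    """Workaround from the issue: every method checks the state first."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable that builds the table
        self._table = None

    def _ensure_loaded(self):
        if self._table is None:
            self._table = self._loader()

    def create_accumulator(self):
        return 0

    def add_input(self, accumulator, element):
        self._ensure_loaded()
        return accumulator + self._table.get(element, 0)

    def merge_accumulators(self, accumulators):
        self._ensure_loaded()
        return sum(accumulators)

    def extract_output(self, accumulator):
        return accumulator


class SetupLookupSum:
    """Requested shape: state is loaded once in setup, released in teardown."""

    def __init__(self, loader):
        self._loader = loader
        self._table = None

    def setup(self):
        self._table = self._loader()

    def teardown(self):
        self._table = None

    def create_accumulator(self):
        return 0

    def add_input(self, accumulator, element):
        return accumulator + self._table.get(element, 0)

    def merge_accumulators(self, accumulators):
        return sum(accumulators)

    def extract_output(self, accumulator):
        return accumulator


def run(fn, elements):
    """Tiny driver standing in for a runner; calls the hooks if present."""
    if hasattr(fn, 'setup'):
        fn.setup()
    accumulator = fn.create_accumulator()
    for element in elements:
        accumulator = fn.add_input(accumulator, element)
    result = fn.extract_output(fn.merge_accumulators([accumulator]))
    if hasattr(fn, 'teardown'):
        fn.teardown()
    return result
```

Both variants produce the same result; the second one removes the repeated initialisation checks and gives the state an explicit cleanup point.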



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
