[ 
https://issues.apache.org/jira/browse/BEAM-3645?focusedWorklogId=261594&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-261594
 ]

ASF GitHub Bot logged work on BEAM-3645:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Jun/19 18:00
            Start Date: 17/Jun/19 18:00
    Worklog Time Spent: 10m 
      Work Description: Hannah-Jiang commented on pull request #8872: 
[BEAM-3645] add ParallelBundleManager
URL: https://github.com/apache/beam/pull/8872#discussion_r294425205
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner_test.py
 ##########
 @@ -1185,6 +1185,32 @@ def create_pipeline(self):
   def test_register_finalizations(self):
     raise unittest.SkipTest("TODO: Avoid bundle finalizations on repeat.")
 
+class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest):
+
+  def create_pipeline(self):
+    return beam.Pipeline(
+        runner=fn_api_runner.FnApiRunner(num_workers=2))
+
+  def test_checkpoint(self):
+    raise unittest.SkipTest("Multiworker doesn't support split request.")
+
+  def test_split_half(self):
+    raise unittest.SkipTest("Multiworker doesn't support split request.")
+
+class FnApiRunnerTestWithMultiWorkersAndBundleRepeat(FnApiRunnerTest):
+
+  def create_pipeline(self):
+    return beam.Pipeline(
+        runner=fn_api_runner.FnApiRunner(num_workers=2, bundle_repeat=2))
+
+  def test_checkpoint(self):
+    raise unittest.SkipTest("Multiworker doesn't support split request.")
 
 Review comment:
   ParallelBundleManager works with all tests except the ones that use a split 
manager. I don't yet fully understand how split requests work, so I'm not sure 
what the underlying issue is. I would like to confirm whether we want to 
support pipelines with split requests in this case, because splitting is 
another way to chunk data into smaller sets. In addition, depending on how the 
split manager is designed, the split function may not be called at all in some 
cases. For example, the `test_split_half` test only runs the split when 
`num_elements == total_num_elements`. With ParallelBundleManager, num_elements 
will not equal total_num_elements, because the input has already been chunked 
into N pieces.
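
   To illustrate the last point, here is a minimal standalone sketch. It is not 
the FnApiRunner or test code: `chunk` and `split_would_fire` are hypothetical 
helpers, and the even, contiguous chunking is only an assumption about how the 
input might be divided across workers.

```python
def chunk(elements, num_workers):
    """Divide elements into num_workers contiguous, roughly equal pieces.

    Hypothetical helper; the real chunking strategy may differ.
    """
    size, remainder = divmod(len(elements), num_workers)
    chunks, start = [], 0
    for i in range(num_workers):
        # The first `remainder` chunks each take one extra element.
        end = start + size + (1 if i < remainder else 0)
        chunks.append(elements[start:end])
        start = end
    return chunks


def split_would_fire(num_elements, total_num_elements):
    """Stand-in for the condition quoted in the comment above (assumed form)."""
    return num_elements == total_num_elements


elements = list(range(10))
total = len(elements)

# One worker sees the whole input, so the condition can be met.
single = [split_would_fire(len(c), total) for c in chunk(elements, 1)]

# Two workers each see only part of the input, so it cannot be met.
multi = [split_would_fire(len(c), total) for c in chunk(elements, 2)]

print(single, multi)
```

   Under these assumptions, the condition holds for a single worker and fails 
for every chunk once the input is divided between two workers.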
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 261594)
    Time Spent: 3h  (was: 2h 50m)

> Support multi-process execution on the FnApiRunner
> --------------------------------------------------
>
>                 Key: BEAM-3645
>                 URL: https://issues.apache.org/jira/browse/BEAM-3645
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-core
>    Affects Versions: 2.2.0, 2.3.0
>            Reporter: Charles Chen
>            Assignee: Hannah Jiang
>            Priority: Major
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/BEAM-3644 gave us a 15x performance 
> gain over the previous DirectRunner.  We can do even better in multi-core 
> environments by supporting multi-process execution in the FnApiRunner, to 
> scale past Python GIL limitations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
