[ 
https://issues.apache.org/jira/browse/BEAM-3645?focusedWorklogId=264356&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-264356
 ]

ASF GitHub Bot logged work on BEAM-3645:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Jun/19 05:15
            Start Date: 21/Jun/19 05:15
    Worklog Time Spent: 10m 
      Work Description: Hannah-Jiang commented on pull request #8872: 
[BEAM-3645] add ParallelBundleManager
URL: https://github.com/apache/beam/pull/8872#discussion_r296097383
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner.py
 ##########
 @@ -139,6 +139,25 @@ def done(self):
       self._state = self.DONE_STATE
 
 
+class _PartitionBuffer(object):
+  """This class is created to support partition(n) function."""
+  def __init__(self, inputs):
+    self._inputs = inputs
+
+  def partition(self, n):
+    v = list(self._inputs.values())[0]
+    if isinstance(v, list):
+      return [self._inputs]
 
 Review comment:
   I tried to split it, however, `test_pardo_timers` and other tests using 
`_run_pardo_state_timers()` cannot pass. A pattern I observed is when we have a 
timer, it has two elements at inputs and should be passed together, otherwise, 
the pipeline is hanging. How do we work around these cases?
   
   an input example with a timer:
   ```
   [{'Create/Read/Impulse': [b'\x7f\xdf;dZ\x1c\xac\t\x00\x00\x00\x01\x0f\x00'], 
'ParDo(TimerDoFn)_timers_to_read_timer/Read': []}]
   ```
   This is splitted into 
   ```
   [{'Create/Read/Impulse': 
[b'\x7f\xdf;dZ\x1c\xac\t\x00\x00\x00\x01\x0f\x00']}, 
{'ParDo(TimerDoFn)_timers_to_read_timer/Read': []}]
   ```
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 264356)
    Time Spent: 8h  (was: 7h 50m)

> Support multi-process execution on the FnApiRunner
> --------------------------------------------------
>
>                 Key: BEAM-3645
>                 URL: https://issues.apache.org/jira/browse/BEAM-3645
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-core
>    Affects Versions: 2.2.0, 2.3.0
>            Reporter: Charles Chen
>            Assignee: Hannah Jiang
>            Priority: Major
>          Time Spent: 8h
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/BEAM-3644 gave us a 15x performance 
> gain over the previous DirectRunner.  We can do even better in multi-core 
> environments by supporting multi-process execution in the FnApiRunner, to 
> scale past Python GIL limitations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to