[ 
https://issues.apache.org/jira/browse/BEAM-9639?focusedWorklogId=422451&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422451
 ]

ASF GitHub Bot logged work on BEAM-9639:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Apr/20 00:17
            Start Date: 15/Apr/20 00:17
    Worklog Time Spent: 10m 
      Work Description: pabloem commented on pull request #11270: 
[BEAM-9639][BEAM-9608] Improvements for FnApiRunner
URL: https://github.com/apache/beam/pull/11270#discussion_r408508964
 
 

 ##########
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##########
 @@ -75,21 +75,27 @@
 
 IMPULSE_BUFFER = b'impulse'
 
+# SideInputId is identified by a consumer ParDo + tag.
+SideInputId = Tuple[str, str]
+
+DataSideInput = Dict[SideInputId,
+                     Tuple[bytes, beam_runner_api_pb2.FunctionSpec]]
+
 
 class Stage(object):
   """A set of Transforms that can be sent to the worker for processing."""
   def __init__(self,
                name,  # type: str
                transforms,  # type: List[beam_runner_api_pb2.PTransform]
-               downstream_side_inputs=None,  # type: Optional[FrozenSet[str]]
+               downstream_side_inputs=None,  # type: Optional[Dict[str, 
SideInputId]]
 
 Review comment:
   Hm so this change breaks that, so the memory requirements would be larger. I 
would think that they would not be too bad, since most graphs don't have many 
side inputs going many places. What do you think? I'm willing to find a better 
solution for this, but I wonder if it's worth the extra time.
   
   The reason that this is made into a dict is to contain more information 
about downstream side inputs. specifically, it contains which transforms will 
consume the side inputs. this is used to commit the side inputs to state after 
they are calculated (rather than before they are consumed). This will be 
necessary for streaming, because side inputs will need to be added to state as 
they are computed.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 422451)
    Time Spent: 3h 40m  (was: 3.5h)

> Abstract bundle execution logic from stage execution logic
> ----------------------------------------------------------
>
>                 Key: BEAM-9639
>                 URL: https://issues.apache.org/jira/browse/BEAM-9639
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Pablo Estrada
>            Assignee: Pablo Estrada
>            Priority: Major
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The FnApiRunner currently works on a per-stage manner, and does not abstract 
> single-bundle execution much. This work item is to clearly define the code to 
> execute a single bundle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to