[ 
https://issues.apache.org/jira/browse/BEAM-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17122752#comment-17122752
 ] 

Beam JIRA Bot commented on BEAM-7745:
-------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it 
has been labeled "stale-P2". If this issue is still affecting you, we care! 
Please comment and remove the label. Otherwise, in 14 days the issue will be 
moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed 
explanation of what these priorities mean.


> StreamingSideInputDoFnRunner/StreamingSideInputFetcher have suboptimal state 
> access pattern during normal operation
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-7745
>                 URL: https://issues.apache.org/jira/browse/BEAM-7745
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Steve Niemitz
>            Priority: P2
>              Labels: stale-P2
>
> I spent some time tracking down sources of uncached state fetches in my job, 
> and one large category was the interaction of StreamingSideInputDoFnRunner + 
> StreamingSideInputFetcher.
> Basically, during standard operations, when the main input is NOT blocked by 
> the side input, the side input fetcher will perform an uncached state read 
> for every input element.  Changing it to cache the blockedMap state gave me a 
> ~30-40% increase in throughput in my job.
> The interaction is a little complicated, and there are a couple of 
> optimizations I can see here.
>  
> Primarily, the blockedMap is only persisted if it is non-empty.  Because the 
> WindmillStateCache won't cache a null value, the "nothing is blocked" signal 
> is never actually cached, so a state read is issued to Windmill for each 
> input element.  The fix seems to be to persist an empty map rather than a 
> null when there are no blocked elements.
>  
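
A minimal sketch of the effect described above, not the actual Beam or
Windmill code. NullSkippingCache, readsFor and the "blockedMap" key are
hypothetical names used only for illustration; the one behavior assumed from
the issue is that the cache never stores a null value. The sketch counts how
many reads reach the backend when "nothing is blocked" is persisted as null
versus as an empty map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class BlockedMapCacheSketch {

    /** Hypothetical stand-in for a state cache that refuses to cache null. */
    static final class NullSkippingCache {
        private final Map<String, Map<String, Integer>> cached = new HashMap<>();
        private final Map<String, Map<String, Integer>> backend = new HashMap<>();
        int backendReads = 0;

        /** Writes to the backend only; a null value clears the stored state. */
        void persist(String key, Map<String, Integer> value) {
            cached.remove(key);
            if (value == null) {
                backend.remove(key);
            } else {
                backend.put(key, value);
            }
        }

        /** Returns the cached value, or falls through to a counted backend read. */
        Map<String, Integer> read(String key) {
            Map<String, Integer> hit = cached.get(key);
            if (hit != null) {
                return hit;
            }
            backendReads++;
            Map<String, Integer> value = backend.get(key);
            if (value != null) {
                cached.put(key, value); // null is never cached
            }
            return value;
        }
    }

    /** Simulates processing `elements` inputs while nothing is blocked. */
    static int readsFor(Map<String, Integer> persistedWhenNothingBlocked, int elements) {
        NullSkippingCache cache = new NullSkippingCache();
        cache.persist("blockedMap", persistedWhenNothingBlocked);
        for (int i = 0; i < elements; i++) {
            Map<String, Integer> blocked = cache.read("blockedMap");
            if (blocked != null && !blocked.isEmpty()) {
                throw new IllegalStateException("unexpected blocked elements");
            }
        }
        return cache.backendReads;
    }

    public static void main(String[] args) {
        System.out.println("null persisted:      " + readsFor(null, 1000));
        System.out.println("empty map persisted: " + readsFor(Collections.emptyMap(), 1000));
    }
}
```

In this toy model the null case goes to the backend once per element, while
the empty-map case goes to the backend only on the first read. How closely
that matches the real runner depends on details not given in the issue.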



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
