[ 
https://issues.apache.org/jira/browse/BEAM-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894007#comment-16894007
 ] 

Kenneth Knowles commented on BEAM-7785:
---------------------------------------

You mentioned maybe more clever locking. It might work, but the problem is that 
you can only update the watermark when there is no currently executing bundle 
that is going to observe it. That seems like it will be a bottleneck no matter 
what. Without making a clever data structure, you could always just do a whole 
copy of the watermark state when you start processing a bundle. It is not 
efficient, but the DirectRunner is already pretty slow.

As one similar reference point, the state became somewhat complicated 
copy-on-write solution for basically the same reason.

> DirectRunner: watermarks are updated asynchronously from bundle processing
> --------------------------------------------------------------------------
>
>                 Key: BEAM-7785
>                 URL: https://issues.apache.org/jira/browse/BEAM-7785
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-direct
>    Affects Versions: 2.13.0
>            Reporter: Jan Lukavský
>            Assignee: Jan Lukavský
>            Priority: Major
>             Fix For: 2.15.0
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> Watermarks are updated in QuiescenceDriver (by calling fireTimers, which 
> calls forceRefresh()) on WatermarkManager. This results in creating timer 
> bundles, that are then processed asynchronously as DirectTransformExecutor. 
> Because of that, watermarks (input watermarks mostly) might be updated while 
> bundle is being processed. That violates assumption, that bundle processing 
> should be atomical (with identical external conditions during processing of 
> whole bundle).



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Reply via email to