[ 
https://issues.apache.org/jira/browse/BEAM-9308?focusedWorklogId=391029&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391029
 ]

ASF GitHub Bot logged work on BEAM-9308:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Feb/20 02:24
            Start Date: 22/Feb/20 02:24
    Worklog Time Spent: 10m 
      Work Description: steveniemitz commented on issue #10852: [BEAM-9308] Decorrelate state cleanup timers
URL: https://github.com/apache/beam/pull/10852#issuecomment-589907688
 
 
   Yay thanks for looking at this.  I'll address your points in reverse order :P
   
   > Maybe we need a better prioritization strategy so that large #s of timers
   > don't starve out elements?
   
   I think that'd be the best overall option, but ideally we'd have variable
   priority, i.e., state cleanup timers should be low priority, while user
   timers should have the same priority as "normal" elements.  In the end
   though, if state cleanup timers end up delayed by N minutes because they are
   deprioritized, we'd be in the same spot as explicitly decorrelating them
   here.
   
   > Delaying the timer will also prevent downstream aggregations from firing.
   > 3 minutes could cause issues if the window itself is much smaller.
   
   Agreed, I sort of touched on this in my comment about letting the duration
   be configurable.  Ideally it'd be some fraction of the window duration itself.
   
   I'm not sure it actually will delay the downstream aggregations from firing,
   however, since the firing time is set to after the window closes
   (maxTimestamp + allowedLateness + 1ms), so once these begin firing, the
   watermark has already passed the end of the window.  Or am I
   misunderstanding something here?
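   
   As a rough illustration of the idea (not the code in the PR), the jittered
   firing time could be computed along these lines.  The class name, the
   `fraction` parameter, the 3 minute cap, and the choice to derive the offset
   from the key's hash are all assumptions made for this sketch; only the base
   time of maxTimestamp + allowedLateness + 1ms comes from the discussion
   above.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: spreads state cleanup timers over a short period. */
final class CleanupTimerJitter {
    /** Assumed upper bound on the spread (the 3 minutes mentioned above). */
    static final Duration MAX_SPREAD = Duration.ofMinutes(3);

    /**
     * Returns the cleanup firing time for one key: the usual
     * maxTimestamp + allowedLateness + 1ms, plus a per-key offset in
     * [0, spread), where spread = min(MAX_SPREAD, fraction * windowDuration).
     */
    static Instant cleanupTime(
            Instant windowMaxTimestamp,
            Duration allowedLateness,
            Duration windowDuration,
            double fraction,
            Object key) {
        // Base time: just after the window expires.
        Instant base = windowMaxTimestamp.plus(allowedLateness).plusMillis(1);

        // Scale the spread with the window so that small windows get small delays.
        long spreadMillis = Math.min(
                MAX_SPREAD.toMillis(),
                (long) (windowDuration.toMillis() * fraction));
        if (spreadMillis <= 0) {
            return base; // No room to spread; fall back to the undelayed time.
        }

        // Deterministic per-key offset, so re-setting the timer for the same
        // key and window always yields the same firing time.
        long offsetMillis = Math.floorMod((long) key.hashCode(), spreadMillis);
        return base.plusMillis(offsetMillis);
    }

    public static void main(String[] args) {
        Instant maxTimestamp = Instant.parse("2020-02-22T00:00:59.999Z");
        Instant fireAt = cleanupTime(
                maxTimestamp, Duration.ofSeconds(10), Duration.ofMinutes(1), 0.1, "some-key");
        System.out.println("cleanup timer fires at " + fireAt);
    }
}
```

   Hashing the key rather than drawing a random number keeps the firing time
   stable if the timer is set more than once for the same key, at the cost of
   relying on hashCode() being reasonably well distributed.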
   
   > We want to reuse this timer for OnWindowExpiration, and this will delay
   > all those callbacks as well.
   
   I'd actually argue that's preferable, since you'd have the same problem
   there as well (potentially millions of timers firing at the same time).
   
   > We currently rely on the state cleanup timer for watermark holds.
   
   Is this true?  The state cleanup timer is already set past the end of the 
window, so by the time the timer fires the window has already closed.
   
   
 


Issue Time Tracking
-------------------

    Worklog Id:     (was: 391029)
    Time Spent: 50m  (was: 40m)

> Optimize state cleanup at end-of-window
> ---------------------------------------
>
>                 Key: BEAM-9308
>                 URL: https://issues.apache.org/jira/browse/BEAM-9308
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Steve Niemitz
>            Assignee: Steve Niemitz
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> When using state with a large keyspace, you can end up with a large number of
> state cleanup timers all set to fire 1ms after the end of a window.  This can
> cause a momentary (I've observed 1-3 minute) lag in processing while Windmill
> and the Java harness fire and process these cleanup timers.
> By spreading the firing over a short period after the end of the window, we
> can decorrelate the firing of the timers and smooth the load out, resulting
> in much less impact from state cleanup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
