[
https://issues.apache.org/jira/browse/BEAM-9308?focusedWorklogId=395475&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-395475
]
ASF GitHub Bot logged work on BEAM-9308:
----------------------------------------
Author: ASF GitHub Bot
Created on: 29/Feb/20 15:54
Start Date: 29/Feb/20 15:54
Worklog Time Spent: 10m
Work Description: steveniemitz commented on issue #10852: [BEAM-9308]
Decorrelate state cleanup timers
URL: https://github.com/apache/beam/pull/10852#issuecomment-592959333
> maybe we need to explore the prioritization issue a bit more.
Agreed, I think ideally the state cleanup timers would have a (much?) lower
priority than everything else so they don't starve out more important "user"
work.
> Is this a blocker for you? If so, then maybe we can add a parameter to
DataflowPipelineOptions to control this so we don't take the risk of changing
the default behavior without more data.
We run our own fork anyway, so it's not particularly a blocker here.
I mostly just intended this PR as a conversation starter.
I am curious about your comment above though ("We currently rely on the
state cleanup timer for watermark holds"). From what I've observed in the
code, the state cleanup is set for after the window end, so delaying it
slightly more shouldn't cause any correctness issues, correct?
Issue Time Tracking
-------------------
Worklog Id: (was: 395475)
Time Spent: 1h 40m (was: 1.5h)
> Optimize state cleanup at end-of-window
> ---------------------------------------
>
> Key: BEAM-9308
> URL: https://issues.apache.org/jira/browse/BEAM-9308
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Reporter: Steve Niemitz
> Assignee: Steve Niemitz
> Priority: Major
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> When using state with a large keyspace, you can end up with a large number of
> state cleanup timers all set to fire 1ms after the end of a window. This can
> cause a momentary lag in processing (I've observed 1-3 minutes) while Windmill
> and the Java harness fire and process these cleanup timers.
> By spreading the firing over a short period after the end of the window, we
> can decorrelate the firing of the timers and smooth the load out, resulting
> in much less impact from state cleanup.
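
To make the idea in the description concrete, here is a minimal sketch of spreading cleanup timers over a short interval after the end of the window. It is not the code from the pull request: the class name CleanupJitter, its methods, and the maxJitterMillis parameter are all hypothetical, and deriving the offset from the key's hash code is just one assumed way to pick a stable per-key delay.

```java
// Hypothetical illustration only; not part of Beam or the Dataflow worker.
final class CleanupJitter {
    private CleanupJitter() {}

    /** Undelayed cleanup time as described in the issue: 1 ms after the window end. */
    static long baseCleanupMillis(long windowEndMillis) {
        return windowEndMillis + 1L;
    }

    /**
     * Cleanup time with a per-key offset in [0, maxJitterMillis].
     * The offset is derived from the key's hash code, so the same key always
     * gets the same cleanup time, while different keys are spread out.
     * The result is never earlier than the undelayed cleanup time.
     */
    static long jitteredCleanupMillis(long windowEndMillis, Object key, long maxJitterMillis) {
        if (maxJitterMillis < 0) {
            throw new IllegalArgumentException("maxJitterMillis must be >= 0");
        }
        // floorMod keeps the offset non-negative even for negative hash codes.
        long offset = Math.floorMod((long) key.hashCode(), maxJitterMillis + 1L);
        return baseCleanupMillis(windowEndMillis) + offset;
    }

    public static void main(String[] args) {
        long windowEnd = 60_000L;    // assumed window end, in epoch millis
        long maxJitter = 3_000L;     // assumed spread of up to 3 seconds
        for (String key : new String[] {"user-1", "user-2", "user-3"}) {
            System.out.println(key + " -> " + jitteredCleanupMillis(windowEnd, key, maxJitter));
        }
    }
}
```

Because the offset is only ever added, each timer still fires after the end of the window, which is the property the comment above asks about. Whether that is sufficient where watermark holds depend on the cleanup timer is the open question in the discussion, and this sketch does not settle it.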