[jira] [Resolved] (FLINK-10886) Event time synchronization across sources

Thomas Weise (Jira) Fri, 25 Nov 2022 05:34:04 -0800


     [ 
https://issues.apache.org/jira/browse/FLINK-10886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Thomas Weise resolved FLINK-10886.
----------------------------------
    Resolution: Done

> Event time synchronization across sources
> -----------------------------------------
>
>                 Key: FLINK-10886
>                 URL: https://issues.apache.org/jira/browse/FLINK-10886
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Common
>            Reporter: Jamie Grier
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> auto-unassigned
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> When reading from a source with many parallel partitions, especially when 
> reading lots of historical data (or recovering from downtime and there is a 
> backlog to read), it's quite common for there to develop an event-time skew 
> across those partitions.
>  
> When doing event-time windowing -- or in fact any event-time driven 
> processing -- the event time skew across partitions results directly in 
> increased buffering in Flink and of course the corresponding state/checkpoint 
> size growth.
>  
> As the event-time skew and state size grows larger this can have a major 
> effect on application performance and in some cases result in a "death 
> spiral" where the application performance get's worse and worse as the state 
> size grows and grows.
>  
> So, one solution to this problem, outside of core changes in Flink itself, 
> seems to be to try to coordinate sources across partitions so that they make 
> progress through event time at roughly the same rate.  In fact if there is 
> large skew the idea would be to slow or even stop reading from some 
> partitions with newer data while first reading the partitions with older 
> data.  Anyway, to do this we need to share state somehow amongst sub-tasks.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Resolved] (FLINK-10886) Event time synchronization across sources

Reply via email to