[jira] [Commented] (FLINK-27155) Reduce multiple reads to the same Changelog file in the same taskmanager during restore

Feifan Wang (Jira) Fri, 22 Apr 2022 08:32:03 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526494#comment-17526494
 ]


Feifan Wang commented on FLINK-27155:
-------------------------------------

I don't think a task being in the RUNNING state means we can clean up the cache 
file, because downloading the changelog and applying it to the delegated 
backend also happen while the task is in the RUNNING state.

 

As for the local space needed to store the changelog cache files, I think it 
should be fine in most scenarios. Can you describe some scenarios in which the 
changelog might be too large to fit on the local disk? Or should we provide an 
option to limit the total size of the cache files?

 

One more point: I think we can save the decompressed content to the local cache 
file, so that we can seek to the start position of a change set efficiently 
(using a file seek rather than reading all the preceding bytes, which causes IO 
amplification).
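
To illustrate the seek idea, here is a minimal sketch. It assumes the 
decompressed changelog has already been written to a local file and that the 
offset and length of each change set are known. The class and method names are 
hypothetical and not part of Flink.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not Flink code: read one change set from a local,
// already decompressed cache file by seeking directly to its start offset.
public class LocalChangelogCacheSketch {

    /** Reads {@code length} bytes starting at {@code offset} without reading the preceding bytes. */
    public static byte[] readChangeSet(Path cacheFile, long offset, int length) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(cacheFile.toFile(), "r")) {
            file.seek(offset);           // jump straight to the change set start
            byte[] buffer = new byte[length];
            file.readFully(buffer);      // throws EOFException if the file is too short
            return buffer;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a decompressed changelog holding three 4-byte change sets.
        Path cacheFile = Files.createTempFile("changelog-cache", ".bin");
        try {
            Files.write(cacheFile, "AAAABBBBCCCC".getBytes(StandardCharsets.UTF_8));
            byte[] second = readChangeSet(cacheFile, 4, 4);
            System.out.println(new String(second, StandardCharsets.UTF_8)); // prints "BBBB"
        } finally {
            Files.deleteIfExists(cacheFile);
        }
    }
}
```

A seek like this only works on the decompressed bytes. With a compressed 
stream, the reader would still have to decode everything before the offset.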

> Reduce multiple reads to the same Changelog file in the same taskmanager 
> during restore
> ---------------------------------------------------------------------------------------
>
>                 Key: FLINK-27155
>                 URL: https://issues.apache.org/jira/browse/FLINK-27155
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / State Backends
>            Reporter: Feifan Wang
>            Assignee: Feifan Wang
>            Priority: Major
>             Fix For: 1.16.0
>
>
> h3. Background
> In the current implementation, state changes of different operators in the 
> same taskmanager may be written to the same changelog file, which effectively 
> reduces the number of files and requests to DFS.
> On the other hand, the current implementation also reads the same 
> changelog file multiple times on recovery. More specifically, the number of 
> times the same changelog file is accessed grows with the number of 
> ChangeSets contained in it. And since each read has to skip the preceding 
> bytes, the network traffic spent on those bytes is wasted.
> The result is a lot of unnecessary requests to DFS when there are multiple 
> slots and keyed state in the same taskmanager.
> h3. Proposal
> We can reduce multiple reads to the same changelog file in the same 
> taskmanager during restore.
> One possible approach is to read each changelog file once, in full, and cache 
> it in memory or in a local file for a period of time.
> I think this could be a subtask of [v2 FLIP-158: Generalized incremental 
> checkpoints|https://issues.apache.org/jira/browse/FLINK-25842] .
> Hi [~ym] , [~roman] , what do you think?
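
The read-once-and-cache approach from the proposal could look roughly like the 
following sketch. It uses the in-memory variant for brevity. The class, the 
`downloader` function, and the `release` method are all assumptions made for 
illustration and do not correspond to existing Flink APIs.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch, not Flink code: download each changelog file at most
// once per taskmanager and serve every change set read from the cached copy.
public class ChangelogFileCacheSketch {

    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

    // Assumed stand-in for the DFS read; must return the full file content.
    private final Function<String, byte[]> downloader;

    public ChangelogFileCacheSketch(Function<String, byte[]> downloader) {
        this.downloader = downloader;
    }

    /** Returns one change set; the remote file is fetched only on the first access. */
    public byte[] readChangeSet(String remotePath, int offset, int length) {
        // ConcurrentHashMap.computeIfAbsent runs the downloader at most once per key,
        // even when several tasks restore from the same file concurrently.
        byte[] content = cache.computeIfAbsent(remotePath, downloader);
        return Arrays.copyOfRange(content, offset, offset + length);
    }

    /** Drops the cached copy, for example once the restore no longer needs it. */
    public void release(String remotePath) {
        cache.remove(remotePath);
    }
}
```

With this shape, two operators that restore change sets from the same file 
trigger a single download. When it is safe to call `release` is exactly the 
open question in this thread, so the sketch leaves that decision to the caller.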



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
