[jira] [Updated] (FLINK-23354) Limit the size of blob cache on TaskExecutor

ASF GitHub Bot (Jira) Thu, 15 Jul 2021 01:13:08 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-23354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


ASF GitHub Bot updated FLINK-23354:
-----------------------------------
    Labels: pull-request-available  (was: )

> Limit the size of blob cache on TaskExecutor
> --------------------------------------------
>
>                 Key: FLINK-23354
>                 URL: https://issues.apache.org/jira/browse/FLINK-23354
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Zhilong Hong
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.14.0
>
>
> Currently, a TaskExecutor uses BlobCache to cache the blobs transported from 
> the JobManager. The cached blobs are stored as local files on the 
> TaskExecutor. A cached blob is not cleaned up until one hour after the 
> related job has finished. At present, JobInformation and TaskInformation are 
> transported via blobs. If many jobs are submitted, the blob cache will occupy 
> a large amount of disk space. In FLINK-23218, we are going to distribute the 
> cached ShuffleDescriptors via blobs as well. When a large number of failovers 
> happen, many cached blobs will accumulate on the local disk. In extreme 
> cases, the blobs would exhaust the disk space.
> So we need to add a size limit for the blob cache on the TaskExecutor, as 
> described in the comments of FLINK-23218. The main idea is to add a size 
> limit and delete blobs in LRU order if the size limit is exceeded. Before 
> a blob is cached, the TaskExecutor will first check the overall size of the 
> cache. If the overall size exceeds the limit, blobs will be deleted in LRU 
> order until the limit is no longer exceeded. If a deleted blob is needed 
> afterwards, it will be downloaded from the blob server again.
> The default value of the size limit of the blob cache on the TaskExecutor 
> will be 10GiB.
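
The LRU eviction idea described above could be sketched roughly as follows. This is a minimal illustration and not Flink's actual implementation: the class name LruBlobSizeTracker and its methods are hypothetical, blob keys are simplified to strings, and it assumes room is made before the new blob is registered. It tracks sizes only; the caller would delete the local files for the returned keys.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical size tracker for a blob cache; illustrative only, not Flink code. */
class LruBlobSizeTracker {
    private final long sizeLimitBytes;
    private long totalBytes = 0L;

    // accessOrder = true: iteration runs from least to most recently used entry.
    private final LinkedHashMap<String, Long> blobSizes =
            new LinkedHashMap<>(16, 0.75f, true);

    LruBlobSizeTracker(long sizeLimitBytes) {
        this.sizeLimitBytes = sizeLimitBytes;
    }

    /**
     * Marks a blob as recently used. Returns false if the blob is not tracked,
     * in which case the caller would download it from the blob server again.
     */
    boolean touch(String blobKey) {
        return blobSizes.get(blobKey) != null;
    }

    /**
     * Registers a blob that is about to be cached. Evicts least recently used
     * blobs until the new blob fits within the limit (or nothing is left to
     * evict) and returns the evicted keys so the caller can delete their files.
     * A blob larger than the whole limit is still registered in this sketch.
     */
    List<String> track(String blobKey, long sizeBytes) {
        List<String> evicted = new ArrayList<>();

        // Re-registering a key replaces its old size.
        Long previous = blobSizes.remove(blobKey);
        if (previous != null) {
            totalBytes -= previous;
        }

        Iterator<Map.Entry<String, Long>> it = blobSizes.entrySet().iterator();
        while (totalBytes + sizeBytes > sizeLimitBytes && it.hasNext()) {
            Map.Entry<String, Long> eldest = it.next();
            totalBytes -= eldest.getValue();
            evicted.add(eldest.getKey());
            it.remove();
        }

        blobSizes.put(blobKey, sizeBytes);
        totalBytes += sizeBytes;
        return evicted;
    }

    long totalBytes() {
        return totalBytes;
    }
}
```

With a limit of 100 bytes, tracking blobs "a" and "b" of 40 bytes each, touching "a", and then tracking a 40-byte blob "c" would evict only "b", since "a" was used more recently.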



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
