[ 
https://issues.apache.org/jira/browse/FLINK-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194915#comment-16194915
 ] 

Stephan Ewen commented on FLINK-7266:
-------------------------------------

True, this is a problem in 1.3.2. The tradeoff was either to issue a very large 
number of redundant directory-emptiness checks (which cause checkpointing to 
stall or be throttled) or to leave the "directories" behind.

In Flink 1.4 we want to fix this by making the checkpoint code aware of the 
file structure, so that dropping a checkpoint becomes a single call that 
deletes its directory, as Steve suggested. The current abstraction is overly 
generic (it only sees state as arbitrary chunks of bytes) and does not 
understand that checkpoint files cluster together in directories.
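
To make the difference in request volume concrete, here is a minimal sketch 
in plain Java. It is not Flink code: CountingStore and CheckpointCleanup are 
hypothetical names, the store is an in-memory stand-in that only counts 
requests, and the recursive drop is modelled as one request, which simplifies 
what a real object store does.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for an object store; it only counts requests. */
class CountingStore {
    private final Set<String> keys = new TreeSet<>();
    int requests = 0;

    /** Test setup helper; not counted as a request. */
    void put(String key) {
        keys.add(key);
    }

    /** One request: delete a single object (a no-op if the key is absent). */
    void delete(String key) {
        requests++;
        keys.remove(key);
    }

    /** One request: check whether any object still exists under the prefix. */
    boolean isEmpty(String prefix) {
        requests++;
        for (String k : keys) {
            if (k.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    /** Modelled as one request (an assumption): drop everything under the prefix. */
    void deleteRecursive(String prefix) {
        requests++;
        keys.removeIf(k -> k.startsWith(prefix));
    }

    int size() {
        return keys.size();
    }
}

class CheckpointCleanup {
    /** Per-file cleanup: every release also probes the parent "directory". */
    static void perFile(CountingStore store, String dir, List<String> files) {
        for (String f : files) {
            store.delete(dir + f);
            if (store.isEmpty(dir)) {
                store.delete(dir); // try to remove the "directory" itself
            }
        }
    }

    /** Structure-aware cleanup: the checkpoint knows its directory and drops it once. */
    static void wholeDirectory(CountingStore store, String dir) {
        store.deleteRecursive(dir);
    }
}
```

With n files in one checkpoint directory, perFile issues 2n + 1 requests in 
this model (n deletes, n emptiness checks, one final directory delete), while 
wholeDirectory issues one.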

> Don't attempt to delete parent directory on S3
> ----------------------------------------------
>
>                 Key: FLINK-7266
>                 URL: https://issues.apache.org/jira/browse/FLINK-7266
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.3.1
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>            Priority: Critical
>             Fix For: 1.4.0, 1.3.2
>
>
> Currently, every attempted release of an S3 state object also checks if the 
> "parent directory" is empty and then tries to delete it.
> Not only is that unnecessary on S3, but it is prohibitively expensive and for 
> example causes S3 to throttle calls by the JobManager on checkpoint cleanup.
> The {{FileState}} must only attempt parent directory cleanup when operating 
> against real file systems, not when operating against object stores.
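
The rule in the issue description above could be sketched roughly as follows. 
This is an illustration under assumptions, not the actual {{FileState}} code: 
StoreKind, StateFileDisposer and planRelease are hypothetical names, and the 
method only returns the operations a release would issue instead of touching 
any file system.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical classification of the storage backend; not taken from Flink. */
enum StoreKind { FILE_SYSTEM, OBJECT_STORE }

class StateFileDisposer {
    /**
     * Returns, in order, the operations that releasing one state file would issue.
     * The parent-directory cleanup is planned only for real file systems.
     */
    static List<String> planRelease(StoreKind kind, String file, String parentDir) {
        List<String> ops = new ArrayList<>();
        ops.add("delete " + file);
        if (kind == StoreKind.FILE_SYSTEM) {
            // Real file systems: empty directories are worth removing.
            ops.add("delete-if-empty " + parentDir);
        }
        // Object stores: "directories" are only key prefixes, so nothing more to do.
        return ops;
    }
}
```

Against an object store the plan contains only the object delete, so no 
emptiness check is ever sent to S3.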



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
