[ 
https://issues.apache.org/jira/browse/FLINK-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847677#comment-16847677
 ] 

Yun Tang commented on FLINK-10855:
----------------------------------

Thanks [~yanghua] for picking up this Jira again. I just wonder whether a 
checkpoint cleaner could really solve FLINK-11662. Suppose we have 1000 tasks 
and task-1 has just declined the checkpoint while the other tasks keep 
checkpointing. Deciding when to clear the checkpoint directory is a tough 
problem, since we cannot guarantee that other tasks will not create that 
parent checkpoint directory again: filesystems such as HDFS generally create a 
non-existing parent folder when writing a file into a sub-folder. At what 
point should we then clean up the parent folder again?

I planned to refactor the checkpoint directory layout in FLINK-10930, but 
that proposal does not seem to be widely accepted.

If we put FLINK-11662 aside and focus on the checkpoint cleaner logic, we 
already implemented a version of cleanup in our internal Flink last year. 
Based on our experience, there are several points to consider for the 
implementation:
 # Reduce the cost of listing files: I already created an issue for this, 
FLINK-11868.
 # When to list files: at least on job failover or region failover.
 # When to delete useless files: we found that deleting takes more time than 
listing, so it should stay in an async thread.
 # Whether to scan files synchronously or asynchronously: an asynchronous scan 
would not block the JM main thread, but could not delete files in time.

Since you assigned this issue to yourself rather eagerly but have not made 
any progress so far, and given the lesson from the other JIRA, I think I 
could also attach a design doc based on our internal version of the cleaner.

> CheckpointCoordinator does not delete checkpoint directory of late/failed 
> checkpoints
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-10855
>                 URL: https://issues.apache.org/jira/browse/FLINK-10855
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.5.5, 1.6.2, 1.7.0
>            Reporter: Till Rohrmann
>            Assignee: vinoyang
>            Priority: Major
>
> In case that an acknowledge checkpoint message is late or a checkpoint cannot 
> be acknowledged, we discard the subtask state in the 
> {{CheckpointCoordinator}}. What's not happening in this case is that we 
> delete the parent directory of the checkpoint. This only happens when we 
> call {{PendingCheckpoint#dispose}}. 
> Due to this behaviour it can happen that a checkpoint fails (e.g. a task not 
> being ready) and we delete the checkpoint directory. Next another task writes 
> its checkpoint data to the checkpoint directory (thereby creating it again) 
> and sends an acknowledge message back to the {{CheckpointCoordinator}}. The 
> {{CheckpointCoordinator}} will realize that there is no longer a 
> {{PendingCheckpoint}} and will discard the sub task state. This will remove 
> the state files from the checkpoint directory but will leave the checkpoint 
> directory untouched.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
