[ 
https://issues.apache.org/jira/browse/FLINK-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15701881#comment-15701881
 ] 

ASF GitHub Bot commented on FLINK-5007:
---------------------------------------

Github user uce commented on the issue:

    https://github.com/apache/flink/pull/2750
  
    @StephanEwen Do you have time to look at this? Currently, when
    externalized checkpoints are configured and the cluster shuts down by
    suspending all jobs, the externalized checkpoints are cleaned up. This PR
    proposes to handle suspension like cancellation and to respect the
    corresponding cleanup configuration: retain the checkpoint if
    `RETAIN_ON_CANCELLATION` is set and delete it if `DELETE_ON_CANCELLATION`
    is set.


> Retain externalized checkpoint on suspension
> --------------------------------------------
>
>                 Key: FLINK-5007
>                 URL: https://issues.apache.org/jira/browse/FLINK-5007
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>             Fix For: 1.2.0
>
>
> Externalized checkpoints are cleaned up when the job is suspended.
> Suspensions happen on graceful shutdown (non-HA) or loss of leadership (HA).
> In the HA case, the checkpoint store does not clean up any checkpoints, as
> they might be recovered by a new leader. The only way to stop an HA job is to
> actually cancel it. Therefore the configured cleanup behaviour doesn't
> matter for suspension.
> In the non-HA case, suspensions happen because of graceful shutdown (for
> example, stopping a YARN session). In this case I would treat suspension the
> same way as cancelling the job with respect to cleanup behaviour.
> {code}
> ExternalizedCheckpointCleanup.DELETE_ON_CANCELLATION => delete on suspension
> ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION => retain on suspension
> {code}
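
The mapping above can be sketched as a small decision function. This is only an illustrative sketch, not the code from the pull request: the enum is a local stand-in that mirrors the two cleanup modes named in the issue, and the class `SuspensionCleanupSketch`, the method `discardOnSuspension`, and the `highAvailability` flag are hypothetical names introduced for the example.

```java
public class SuspensionCleanupSketch {

    /** Local stand-in mirroring the two cleanup modes named in the issue. */
    enum ExternalizedCheckpointCleanup {
        DELETE_ON_CANCELLATION,
        RETAIN_ON_CANCELLATION
    }

    /**
     * Hypothetical helper: decides whether an externalized checkpoint
     * should be discarded when the job is suspended.
     */
    static boolean discardOnSuspension(
            ExternalizedCheckpointCleanup cleanup, boolean highAvailability) {
        if (highAvailability) {
            // HA: a new leader might recover the checkpoint, so never discard.
            return false;
        }
        // Non-HA: treat suspension the same way as cancellation.
        return cleanup == ExternalizedCheckpointCleanup.DELETE_ON_CANCELLATION;
    }

    public static void main(String[] args) {
        // Print the non-HA decision for each cleanup mode.
        for (ExternalizedCheckpointCleanup cleanup
                : ExternalizedCheckpointCleanup.values()) {
            String action = discardOnSuspension(cleanup, false) ? "delete" : "retain";
            System.out.println(cleanup + " => " + action + " on suspension");
        }
    }
}
```

Keeping the HA check first reflects the description's point that the configured cleanup mode is irrelevant for suspension under HA.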



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
