Re: Checkpoints not removing

Till Rohrmann Fri, 08 Sep 2017 01:55:04 -0700

Hi,

you're right that this should happen automatically. The delete operation
is executed by an asynchronous thread and can therefore happen a bit later
than the point at which the checkpoint is discarded. What we have seen in
the past is that if you use S3, for example, the write and delete
operations can be throttled. As a result, the delete operations piled up
but still took place eventually. It would therefore be helpful to know
which file system you checkpoint the state to. Moreover, are the
checkpoint files never deleted, or only deleted slowly?
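
To tell "never deleted" apart from "deleted slowly", one option is to
sample the job's checkpoint directory a few times and watch the count. The
following is only a rough sketch, not part of Flink: the class and method
names are made up for illustration, and it assumes the directory is on a
local or mounted file system and that checkpoint subdirectories are named
"chk-<id>".

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not a Flink class): counts "chk-*" entries in a job's
// checkpoint directory so that repeated samples show whether old checkpoints
// are removed at all, or only with a delay.
class CheckpointDirProbe {

    // Counts direct children of jobCheckpointDir whose name starts with "chk-".
    // The "chk-" prefix is an assumption about the directory layout.
    static int countCheckpoints(Path jobCheckpointDir) throws IOException {
        int count = 0;
        try (DirectoryStream<Path> entries =
                 Files.newDirectoryStream(jobCheckpointDir, "chk-*")) {
            for (Path ignored : entries) {
                count++;
            }
        }
        return count;
    }

    // Takes `samples` counts, sleeping pauseMillis between consecutive samples.
    static int[] sample(Path jobCheckpointDir, int samples, long pauseMillis)
            throws IOException, InterruptedException {
        int[] counts = new int[samples];
        for (int i = 0; i < samples; i++) {
            counts[i] = countCheckpoints(jobCheckpointDir);
            if (i < samples - 1) {
                Thread.sleep(pauseMillis);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: CheckpointDirProbe <job-checkpoint-dir>");
            return;
        }
        // 30s pause mirrors the checkpoint interval quoted below.
        for (int c : sample(Paths.get(args[0]), 10, 30_000L)) {
            System.out.println(c);
        }
    }
}
```

A count that only ever grows points to deletes not happening at all. A
count that fluctuates or falls behind and later drops points to slow or
throttled deletes.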


For further debugging, it would be really helpful to get the JobManager
log files at DEBUG log level.

Cheers,
Till

On Thu, Sep 7, 2017 at 7:25 PM, rnosworthy <
[email protected]> wrote:

> Flink 1.3.2
> FileState Backend
> Currently have 1 Job Manager with 1 Task Manager
>
> I believe this should happen automatically, however there are hundreds of
> checkpoint files building up in my data directory.
>
> I have made numerous attempts to clean up the checkpoint data by setting
> fileStateSizeThreshold when instantiating the FsStateBackend object for
> the environment.
>
> I have also tried to set config option 'state.checkpoints.num-retained: 5'
>
> Is there something I am doing wrong or is this a potential bug in 1.3.2?
>
> Checkpoint Config:
> Option:                              Value
> Checkpointing Mode:                  Exactly Once
> Interval:                            30s
> Timeout:                             10m 0s
> Minimum Pause Between Checkpoints:   0ms
> Maximum Concurrent Checkpoints:      1
> Persist Checkpoints Externally:      Disabled
>
>
>
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>
