[
https://issues.apache.org/jira/browse/FLINK-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712443#comment-15712443
]
ASF GitHub Bot commented on FLINK-5214:
---------------------------------------
GitHub user tillrohrmann opened a pull request:
https://github.com/apache/flink/pull/2918
[FLINK-5214] Clean up checkpoint data in case of a failing checkpoint operation
Adds exception handling to the stream operators for the snapshotState method.
If an exception occurs while performing the snapshot operation, all data
checkpointed up to that point is discarded/deleted. This ensures that a failing
checkpoint operation won't leave orphaned checkpoint data (e.g. files) behind.
Add test case for FsCheckpointStateOutputStream
Add RocksDB FullyAsyncSnapshot cleanup test
Add proper state cleanup tests for window operator
Add state cleanup test for failing snapshot call of AbstractUdfStreamOperator
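The cleanup behaviour described above can be illustrated with a minimal sketch. This is not the code from the pull request: `StateHandle`, `SnapshotStep` and `snapshotAll` are hypothetical names chosen for illustration, standing in for the checkpointed data and the per-operator snapshot calls. The idea is only that every handle produced before the failure is discarded before the exception is rethrown.

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotCleanupSketch {

    /** Hypothetical stand-in for a piece of checkpointed data (e.g. a file). */
    interface StateHandle {
        void discard() throws Exception;
    }

    /** Hypothetical stand-in for one operator's snapshot call. */
    interface SnapshotStep {
        StateHandle snapshot() throws Exception;
    }

    /** Runs all steps; if one fails, discards everything produced so far. */
    static List<StateHandle> snapshotAll(List<SnapshotStep> steps) throws Exception {
        List<StateHandle> completed = new ArrayList<>();
        try {
            for (SnapshotStep step : steps) {
                completed.add(step.snapshot());
            }
            return completed;
        } catch (Exception snapshotFailure) {
            for (StateHandle handle : completed) {
                try {
                    handle.discard();
                } catch (Exception discardFailure) {
                    // Keep the original failure as the primary exception.
                    snapshotFailure.addSuppressed(discardFailure);
                }
            }
            throw snapshotFailure;
        }
    }

    public static void main(String[] args) {
        List<String> discarded = new ArrayList<>();
        List<SnapshotStep> steps = new ArrayList<>();
        // First step succeeds and records when its handle is discarded.
        steps.add(() -> () -> discarded.add("first"));
        // Second step simulates a failing snapshot operation.
        steps.add(() -> { throw new IllegalStateException("snapshot failed"); });
        try {
            snapshotAll(steps);
        } catch (Exception e) {
            System.out.println("failed: " + e.getMessage() + ", discarded: " + discarded);
        }
    }
}
```

Attaching discard failures as suppressed exceptions is a design choice of this sketch; it keeps the original snapshot failure visible to the caller while still attempting to clean up every handle.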
cc @StephanEwen
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tillrohrmann/flink fixTaskCheckpointFailure
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/2918.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2918
----
commit 35fc74dd501fc49aa0b55f415c85c2140206220a
Author: Till Rohrmann <[email protected]>
Date: 2016-12-01T12:25:05Z
[FLINK-5214] Clean up checkpoint data in case of a failing checkpoint operation
Adds exception handling to the stream operators for the snapshotState method.
If an exception occurs while performing the snapshot operation, all data
checkpointed up to that point is discarded/deleted. This ensures that a failing
checkpoint operation won't leave orphaned checkpoint data (e.g. files) behind.
Add test case for FsCheckpointStateOutputStream
Add RocksDB FullyAsyncSnapshot cleanup test
Add proper state cleanup tests for window operator
Add state cleanup test for failing snapshot call of AbstractUdfStreamOperator
----
> Clean up checkpoint files when failing checkpoint operation on TM
> -----------------------------------------------------------------
>
> Key: FLINK-5214
> URL: https://issues.apache.org/jira/browse/FLINK-5214
> Project: Flink
> Issue Type: Bug
> Components: TaskManager
> Affects Versions: 1.2.0, 1.1.3
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Fix For: 1.2.0, 1.1.4
>
>
> When the {{StreamTask#performCheckpoint}} operation fails on a
> {{TaskManager}}, any checkpoint files it may already have created are not
> cleaned up. These files should be removed when the checkpoint fails.
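As a rough illustration of file-level cleanup for the problem described in the issue, the sketch below shows an output stream that deletes its file unless the caller explicitly keeps it. This is an assumption-laden example, not Flink's implementation: the class `DiscardOnCloseStreamSketch` and its method `closeAndKeep` are hypothetical names, and only standard `java.nio.file` calls are used.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class DiscardOnCloseStreamSketch extends OutputStream {
    private final Path file;
    private final OutputStream out;
    private boolean committed;
    private boolean closed;

    DiscardOnCloseStreamSketch(Path file) throws IOException {
        this.file = file;
        this.out = Files.newOutputStream(file);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
    }

    /** Closes the stream and keeps the file (the successful path). */
    Path closeAndKeep() throws IOException {
        committed = true;
        close();
        return file;
    }

    /** A plain close without a prior closeAndKeep deletes the file. */
    @Override
    public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        try {
            out.close();
        } finally {
            if (!committed) {
                Files.deleteIfExists(file);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("chk-sketch");
        Path file = dir.resolve("state");
        try (DiscardOnCloseStreamSketch s = new DiscardOnCloseStreamSketch(file)) {
            s.write(42);
            throw new IOException("simulated snapshot failure");
        } catch (IOException e) {
            // The stream was closed before this handler ran, so the file is gone.
            System.out.println("file left behind: " + Files.exists(file));
        }
    }
}
```

With this shape, a caller that wraps the stream in try-with-resources gets cleanup on any exception without extra handling, and only the explicit `closeAndKeep` call leaves the file in place.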