GitHub user StephanEwen opened a pull request:
https://github.com/apache/flink/pull/2920
[FLINK-5218] [state backends] Eagerly close checkpoint streams on
cancellation
When a task is canceled during a checkpoint operation, the operation needs
to cancel fast.
This is a forward fix from version 1.1, where checkpoints could get stuck
when the state output streams did not handle interruptions correctly (HDFS has
that problem).
Most of this is already handled in version 1.2 via the *CloseableRegistry*.
This adds a test to validate this case is handled correctly and adds minor
changes to make it work reliably, like:
- fail fast on `write()` on closed checkpoint streams
- fail fast on `flush()` on closed checkpoint streams
- slight optimization to save a flag in the checkpoint streams
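As an illustration of the fail-fast behavior listed above, here is a minimal sketch, assuming a simple wrapper around an arbitrary `OutputStream`. The class name `FailFastCheckpointStream` and its `ensureOpen()` helper are hypothetical and are not taken from the Flink code base; the actual checkpoint stream classes in this pull request may be structured differently.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/**
 * Hypothetical wrapper (not Flink's actual class) showing how write() and
 * flush() can fail fast once the stream has been closed, for example by a
 * canceling thread.
 */
class FailFastCheckpointStream extends OutputStream {

    private final OutputStream delegate;

    // volatile so that a close() issued by another thread is seen promptly
    private volatile boolean closed;

    FailFastCheckpointStream(OutputStream delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(int b) throws IOException {
        ensureOpen();
        delegate.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        ensureOpen();
        delegate.write(b, off, len);
    }

    @Override
    public void flush() throws IOException {
        ensureOpen();
        delegate.flush();
    }

    @Override
    public void close() throws IOException {
        // idempotent: only the first call closes the underlying stream
        if (!closed) {
            closed = true;
            delegate.close();
        }
    }

    boolean isClosed() {
        return closed;
    }

    private void ensureOpen() throws IOException {
        if (closed) {
            throw new IOException("Checkpoint stream is closed");
        }
    }
}
```

With a wrapper like this, a checkpoint thread that keeps calling `write()` or `flush()` after cancellation gets an `IOException` immediately instead of relying on the underlying stream to react to a thread interrupt.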
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/StephanEwen/incubator-flink closing_validation
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/2920.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2920
commit e592c098f25f97b223f07ff84cd2fd9233e36dc4
Author: Stephan Ewen
Date: 2016-12-01T16:12:12Z
[FLINK-5218] [state backends] Add test that validates that Checkpoint
Streams are eagerly closed on cancellation.
This is important for some stream implementations (such as HDFS) that do
not properly handle thread interruption.