GitHub user tillrohrmann opened a pull request:
https://github.com/apache/flink/pull/4844
[FLINK-7844] [ckPt] Fail unacknowledged pending checkpoints for fine grained recovery
## What is the purpose of the change
This commit fails all pending checkpoints that have not been acknowledged by the failed task when fine-grained recovery is used. This avoids long checkpoint timeouts, which might block the `CheckpointCoordinator` from triggering new checkpoints.
## Brief change log
- Introduce `CheckpointCoordinator#failUnacknowledgedPendingCheckpointsFor` to fail all unacknowledged pending checkpoints for a given `ExecutionAttemptID`
- Fail unacknowledged pending checkpoints in `ExecutionGraph#notifyExecutionChange`
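
The idea behind the first change can be sketched roughly as follows. This is a simplified, self-contained model and not the code from this pull request: `CoordinatorSketch`, `Pending`, and the use of plain `String` attempt IDs are hypothetical stand-ins for `CheckpointCoordinator`, its pending checkpoints, and `ExecutionAttemptID`. Only the method name `failUnacknowledgedPendingCheckpointsFor` is taken from the change log; its return value here is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for a checkpoint coordinator (not Flink's class). */
class CoordinatorSketch {

    /** Hypothetical pending checkpoint that tracks which attempts still have to acknowledge. */
    static final class Pending {
        final long checkpointId;
        final Set<String> notYetAcknowledged;
        boolean discarded;

        Pending(long checkpointId, Set<String> participatingAttempts) {
            this.checkpointId = checkpointId;
            this.notYetAcknowledged = new HashSet<>(participatingAttempts);
        }

        void acknowledge(String attemptId) {
            notYetAcknowledged.remove(attemptId);
        }

        /** True if the attempt has acknowledged, or never took part in this checkpoint. */
        boolean isAcknowledgedBy(String attemptId) {
            return !notYetAcknowledged.contains(attemptId);
        }
    }

    private final Map<Long, Pending> pendingCheckpoints = new HashMap<>();

    void addPending(Pending pending) {
        pendingCheckpoints.put(pending.checkpointId, pending);
    }

    int numberOfPendingCheckpoints() {
        return pendingCheckpoints.size();
    }

    /**
     * Discards and removes every pending checkpoint that the given (failed) attempt
     * has not acknowledged, so that it cannot linger until its timeout expires.
     * Returns the number of failed checkpoints (an assumption of this sketch).
     */
    int failUnacknowledgedPendingCheckpointsFor(String failedAttemptId) {
        int failed = 0;
        Iterator<Pending> iterator = pendingCheckpoints.values().iterator();
        while (iterator.hasNext()) {
            Pending pending = iterator.next();
            if (!pending.isAcknowledgedBy(failedAttemptId)) {
                pending.discarded = true;
                iterator.remove();
                failed++;
            }
        }
        return failed;
    }
}
```

In this model, a checkpoint that the failed attempt has already acknowledged is left pending, since the remaining tasks may still be able to complete it.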
## Verifying this change
- `IndividualRestartsConcurrencyTest#testLocalFailureFailsPendingCheckpoints` tests that unacknowledged pending checkpoints are discarded and removed from the `CheckpointCoordinator` in case of a local failure
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
## Documentation
- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tillrohrmann/flink failCheckpointsFineGrainedRecovery
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/4844.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4844
----
commit 454066b2f606f83e159e553506d58ce3a49a256d
Author: Till <[email protected]>
Date: 2017-10-17T08:57:37Z
[FLINK-7844] [ckPt] Fail unacknowledged pending checkpoints for fine grained recovery
This commit will fail all pending checkpoints which have not been acknowledged by the failed task in case of fine grained recovery. This is done in order to avoid long checkpoint timeouts which might block the CheckpointCoordinator from triggering new checkpoints
----