1u0 commented on a change in pull request #9131: [FLINK-12858][checkpointing]
Stop-with-savepoint, workaround: fail whole job when savepoint is declined by a
task
URL: https://github.com/apache/flink/pull/9131#discussion_r303938502
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/LegacyScheduler.java
##########
@@ -649,4 +659,22 @@ private String retrieveTaskManagerLocation(ExecutionAttemptID executionAttemptID
.map(TaskManagerLocation::toString)
.orElse("Unknown location");
}
+
+ private static boolean isCheckpointDeclinedException(Throwable throwable) {
+ return ExceptionUtils.findThrowable(throwable, CheckpointException.class)
+ .map(CheckpointException::getCheckpointFailureReason)
+ .map(reason -> {
+ switch (reason) {
+ case CHECKPOINT_DECLINED:
+ case CHECKPOINT_DECLINED_TASK_NOT_READY:
+ case CHECKPOINT_DECLINED_SUBSUMED:
+ case CHECKPOINT_DECLINED_ALIGNMENT_LIMIT_EXCEEDED:
+ case CHECKPOINT_DECLINED_INPUT_END_OF_STREAM:
Review comment:
**NB:** this check is very rough. It may be too pessimistic, in that some
causes do not necessarily leave the job in a half-locked state
(`CHECKPOINT_DECLINED`, `CHECKPOINT_DECLINED_TASK_NOT_READY`).
* `CHECKPOINT_DECLINED_ALIGNMENT_LIMIT_EXCEEDED` is the case I can reproduce;
* `CHECKPOINT_DECLINED_INPUT_END_OF_STREAM` is also a potential issue, but
may not be easy to reproduce (it should happen during checkpoint alignment in a
join, when one branch has passed the checkpoint and the other has just ended);
* `CHECKPOINT_DECLINED_SUBSUMED` should not happen, but is kept to be more
future-proof;
* `TASK_CHECKPOINT_FAILURE`: I'm not sure whether this one should also be
present here.

Also, an open question: what should happen if the exception is not a
`CheckpointException`? We expect such causes to fail the task that originated
the exception, but I'm not sure how that would interact with region recovery.
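
To make the classification above easier to discuss, here is a minimal, self-contained sketch of the same idea. It is not the code from the PR: `Reason` and `SketchCheckpointException` are hypothetical stand-ins for Flink's failure-reason enum and `CheckpointException`, and the cause-chain walk is a simplified substitute for `ExceptionUtils.findThrowable`. Which reasons belong in the set is exactly the open question, so the set below only mirrors the five cases listed in the diff and leaves `TASK_CHECKPOINT_FAILURE` out.

```java
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

class DeclinedReasonSketch {

    // Hypothetical stand-in for the failure-reason enum; only the
    // constants mentioned in this review comment are included.
    enum Reason {
        CHECKPOINT_DECLINED,
        CHECKPOINT_DECLINED_TASK_NOT_READY,
        CHECKPOINT_DECLINED_SUBSUMED,
        CHECKPOINT_DECLINED_ALIGNMENT_LIMIT_EXCEEDED,
        CHECKPOINT_DECLINED_INPUT_END_OF_STREAM,
        TASK_CHECKPOINT_FAILURE
    }

    // Hypothetical stand-in for CheckpointException.
    static class SketchCheckpointException extends Exception {
        private final Reason reason;

        SketchCheckpointException(Reason reason) {
            super(reason.name());
            this.reason = reason;
        }

        Reason getReason() {
            return reason;
        }
    }

    // Assumption: the same five reasons as in the diff. Keeping them in one
    // set makes it a one-line change to drop the pessimistic entries or to
    // add TASK_CHECKPOINT_FAILURE later.
    static final Set<Reason> FAIL_JOB_REASONS = EnumSet.of(
            Reason.CHECKPOINT_DECLINED,
            Reason.CHECKPOINT_DECLINED_TASK_NOT_READY,
            Reason.CHECKPOINT_DECLINED_SUBSUMED,
            Reason.CHECKPOINT_DECLINED_ALIGNMENT_LIMIT_EXCEEDED,
            Reason.CHECKPOINT_DECLINED_INPUT_END_OF_STREAM);

    // Walks the cause chain (bounded, to guard against cyclic causes) and
    // returns the reason of the first checkpoint exception found, if any.
    static Optional<Reason> findReason(Throwable throwable) {
        Throwable current = throwable;
        for (int depth = 0; current != null && depth < 64; depth++) {
            if (current instanceof SketchCheckpointException) {
                return Optional.of(((SketchCheckpointException) current).getReason());
            }
            current = current.getCause();
        }
        return Optional.empty();
    }

    // Returns false when no checkpoint exception is in the chain; that case
    // is left to normal task failure handling in this sketch.
    static boolean isCheckpointDeclined(Throwable throwable) {
        return findReason(throwable).map(FAIL_JOB_REASONS::contains).orElse(false);
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException(
                new SketchCheckpointException(Reason.CHECKPOINT_DECLINED_ALIGNMENT_LIMIT_EXCEEDED));
        System.out.println(isCheckpointDeclined(wrapped));
        System.out.println(isCheckpointDeclined(new IllegalStateException("not a checkpoint error")));
    }
}
```

With a set instead of a `switch`, the non-`CheckpointException` case stays explicit as the `orElse(false)` branch, so whatever is decided for region recovery can be changed in one place.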