StefanRRichter commented on a change in pull request #8745: [FLINK-11662]
Disable task to fail on checkpoint errors
URL: https://github.com/apache/flink/pull/8745#discussion_r297719324
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointFailureManager.java
##########
@@ -31,18 +31,24 @@
*/
public class CheckpointFailureManager {
- private final static int UNLIMITED_TOLERABLE_FAILURE_NUMBER =
Integer.MAX_VALUE;
+ public final static int UNLIMITED_TOLERABLE_FAILURE_NUMBER =
Integer.MAX_VALUE;
+ public final static int UNDEFINED_TOLERABLE_CHECKPOINT_NUMBER = -1;
private final int tolerableCpFailureNumber;
private final FailJobCallback failureCallback;
private final AtomicInteger continuousFailureCounter;
private final Set<Long> countedCheckpointIds;
public CheckpointFailureManager(int tolerableCpFailureNumber,
FailJobCallback failureCallback) {
- checkArgument(tolerableCpFailureNumber >= 0,
+ checkArgument(tolerableCpFailureNumber >= 0 ||
tolerableCpFailureNumber == UNDEFINED_TOLERABLE_CHECKPOINT_NUMBER,
"The tolerable checkpoint failure number is illegal, " +
"it must be greater than or equal to 0 .");
- this.tolerableCpFailureNumber = tolerableCpFailureNumber;
+ // set tolerableCpFailureNumber as 0 when passing
UNDEFINED_TOLERABLE_CHECKPOINT_NUMBER.
+ if (tolerableCpFailureNumber ==
UNDEFINED_TOLERABLE_CHECKPOINT_NUMBER) {
Review comment:
Yes, your suggestion sounds good and I am also not against this in
`CheckpointConfig`, I just think `CheckpointFailureManager` constructor is a
bit late.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services