Myasuka commented on a change in pull request #8745: [FLINK-11662] Disable task 
to fail on checkpoint errors
URL: https://github.com/apache/flink/pull/8745#discussion_r297253916
 
 

 ##########
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointFailureManager.java
 ##########
 @@ -31,18 +31,24 @@
  */
 public class CheckpointFailureManager {
 
-       private final static int UNLIMITED_TOLERABLE_FAILURE_NUMBER = 
Integer.MAX_VALUE;
+       public final static int UNLIMITED_TOLERABLE_FAILURE_NUMBER = 
Integer.MAX_VALUE;
+       public final static int UNDEFINED_TOLERABLE_CHECKPOINT_NUMBER = -1;
 
        private final int tolerableCpFailureNumber;
        private final FailJobCallback failureCallback;
        private final AtomicInteger continuousFailureCounter;
        private final Set<Long> countedCheckpointIds;
 
        public CheckpointFailureManager(int tolerableCpFailureNumber, 
FailJobCallback failureCallback) {
-               checkArgument(tolerableCpFailureNumber >= 0,
+               checkArgument(tolerableCpFailureNumber >= 0 || 
tolerableCpFailureNumber == UNDEFINED_TOLERABLE_CHECKPOINT_NUMBER,
                        "The tolerable checkpoint failure number is illegal, " +
                                "it must be greater than or equal to 0 .");
-               this.tolerableCpFailureNumber = tolerableCpFailureNumber;
+               // set tolerableCpFailureNumber as 0 when passing 
UNDEFINED_TOLERABLE_CHECKPOINT_NUMBER.
+               if (tolerableCpFailureNumber == 
UNDEFINED_TOLERABLE_CHECKPOINT_NUMBER) {
 
 Review comment:
   I agree to make `tolerableCheckpointFailureNumber` valid before 
`CheckpointFailureManager` or `CheckpointCoordinatorConfiguration`. But I still 
want to introduce `UNDEFINED_TOLERABLE_CHECKPOINT_NUMBER` at least in 
`CheckpointConfig` to judge whether `setFailOnCheckpointingErrors` would take 
effect. 
   The only left problem is when we want to call 
`CheckpointConfig#getTolerableCheckpointFailureNumber`. If that field was 
already `-1`, shall we set the `tolerableCheckpointFailureNumber` as `0` or 
just return `0` without setting that. I prefer to return `0` but leave 
`tolerableCheckpointFailureNumber` still as `-1` to avoid such side effects. 
What do you think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to