zhijiangW commented on a change in pull request #11509:
URL: https://github.com/apache/flink/pull/11509#discussion_r495564020



##########
File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/AsyncCheckpointRunnable.java
##########
@@ -197,7 +199,8 @@ private void handleExecutionException(Exception e) {
                                // We only report the exception for the original cause of fail and cleanup.
                                // Otherwise this followup exception could race the original exception in failing the task.
                                try {
-                                       taskEnvironment.declineCheckpoint(checkpointMetaData.getCheckpointId(), checkpointException);
+                                       taskEnvironment.declineCheckpoint(checkpointMetaData.getCheckpointId(),
+                                                       new CheckpointException(CheckpointFailureReason.EXCEPTION, checkpointException));

Review comment:
   I have two concerns about this change:
   
   - According to the corresponding Jira description, the motivation is to resolve the misleading `CheckpointFailureReason.JOB_FAILURE`. But when I traced the code, I found that `JOB_FAILURE` is used in the following part of `CheckpointCoordinator#receiveDeclineMessage`:
   
   ```
   final CheckpointException checkpointException;
   if (message.getReason() == null) {
       checkpointException = new CheckpointException(CheckpointFailureReason.CHECKPOINT_DECLINED);
   } else {
       checkpointException = getCheckpointException(CheckpointFailureReason.JOB_FAILURE, message.getReason());
   }
   ```
   
   So I guess that whichever exception is wrapped here, it will still be treated as `JOB_FAILURE` unless the reason is `null`. Or did I misunderstand something?
   
   - The semantics do not seem correct when we use `CheckpointFailureReason.EXCEPTION` here, because `CheckpointFailureReason#preFlight` will be true for this exception. But if the exception happens during the async checkpoint process, `#preFlight` should be false.
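   
   To make both concerns concrete, here is a minimal, self-contained sketch. It does not use the real Flink classes: `Reason`, `SketchCheckpointException`, and `onDecline` are hypothetical stand-ins, and `ASYNC_EXCEPTION` is a made-up reason name used only to show what a non-pre-flight reason would look like. The `preFlight` flags other than the one for `EXCEPTION` are assumptions, and `onDecline` assumes the coordinator helper applies the default reason to whatever cause it receives, which is exactly the reading I am asking about and may not match the actual helper.
   
   ```java
   public class DeclineReasonSketch {

       /** Simplified stand-in for CheckpointFailureReason; the flag values are assumptions. */
       public enum Reason {
           EXCEPTION(true),            // pre-flight, as stated in the second concern
           JOB_FAILURE(false),         // assumed value
           CHECKPOINT_DECLINED(false), // assumed value
           ASYNC_EXCEPTION(false);     // hypothetical reason for failures in the async phase

           private final boolean preFlight;

           Reason(boolean preFlight) {
               this.preFlight = preFlight;
           }

           public boolean isPreFlight() {
               return preFlight;
           }
       }

       /** Simplified stand-in for CheckpointException: a failure reason plus an optional cause. */
       public static final class SketchCheckpointException extends Exception {
           private final Reason reason;

           public SketchCheckpointException(Reason reason, Throwable cause) {
               super(reason.name(), cause);
               this.reason = reason;
           }

           public Reason getReason() {
               return reason;
           }
       }

       /**
        * Models the coordinator branch quoted above, under the assumption that the
        * helper always applies the default reason (JOB_FAILURE) to a non-null cause.
        */
       public static SketchCheckpointException onDecline(Throwable declineCause) {
           if (declineCause == null) {
               return new SketchCheckpointException(Reason.CHECKPOINT_DECLINED, null);
           }
           return new SketchCheckpointException(Reason.JOB_FAILURE, declineCause);
       }

       public static void main(String[] args) {
           Throwable asyncFailure = new java.io.IOException("state upload failed");

           // What the task side does after this change: wrap the failure with EXCEPTION.
           SketchCheckpointException wrapped =
                   new SketchCheckpointException(Reason.EXCEPTION, asyncFailure);

           // Concern 2: the wrapped exception is flagged as pre-flight.
           System.out.println("task side pre-flight: " + wrapped.getReason().isPreFlight());

           // Concern 1: under the stated assumption, the coordinator side reports JOB_FAILURE anyway.
           System.out.println("coordinator reason: " + onDecline(wrapped).getReason());
       }
   }
   ```
   
   If this reading is right, the wrapping on the task side changes nothing about what the coordinator reports, and a reason whose pre-flight flag is false would describe an async-phase failure better than `EXCEPTION`.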
   
   Another minor comment on formatting: either put every argument on a separate line or keep them all on the same line. :)





