zhijiangW commented on a change in pull request #12478:
URL: https://github.com/apache/flink/pull/12478#discussion_r437897063



##########
File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/SubtaskCheckpointCoordinatorImpl.java
##########
@@ -186,7 +186,7 @@ public void abortCheckpointOnBarrier(long checkpointId, Throwable cause, Operato
 
                checkpointStorage.clearCacheFor(checkpointId);
 
-               channelStateWriter.abort(checkpointId, cause);
+               channelStateWriter.abort(checkpointId, cause, true);

Review comment:
       I am not sure whether we still have a race condition here.
`abortCheckpointOnBarrier` might be called from `CheckpointBarrierUnaligner`
after the checkpoint has already been triggered into the mailbox. After
aborting, we do not remove the checkpoint action from the mailbox, so the
checkpoint might still be executed afterwards. How do we guarantee that
`#getAndRemoveWriteResult` is never called after aborting?
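
To make the ordering I am worried about concrete, here is a minimal single-threaded sketch. Everything in it is a hypothetical stand-in and not the actual Flink code: `ToyChannelStateWriter`, its `start` method, the plain `Queue<Runnable>` used as a mailbox, and the assumption that `getAndRemoveWriteResult` throws when no result exists are all invented for illustration. It only shows that a checkpoint action left in the mailbox can still reach the writer after the abort.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical, simplified stand-ins; NOT Flink's real classes or behavior.
public class MailboxAbortRaceSketch {

    /** Toy writer that keeps one pending result per checkpoint id. */
    static class ToyChannelStateWriter {
        private final Map<Long, String> results = new HashMap<>();

        void start(long checkpointId) {
            results.put(checkpointId, "result-" + checkpointId);
        }

        void abort(long checkpointId) {
            // Aborting drops the pending result for this checkpoint.
            results.remove(checkpointId);
        }

        String getAndRemoveWriteResult(long checkpointId) {
            String result = results.remove(checkpointId);
            if (result == null) {
                // Assumed failure mode for this sketch only.
                throw new IllegalStateException(
                        "no write result for checkpoint " + checkpointId);
            }
            return result;
        }
    }

    /** Replays the ordering: enqueue checkpoint action, abort, then drain the mailbox. */
    static String simulate() {
        ToyChannelStateWriter writer = new ToyChannelStateWriter();
        Queue<Runnable> mailbox = new ArrayDeque<>();
        String[] outcome = {"not run"};
        long checkpointId = 1L;

        // 1. The checkpoint is triggered: the writer starts and the action is mailed.
        writer.start(checkpointId);
        mailbox.add(() -> {
            try {
                outcome[0] = writer.getAndRemoveWriteResult(checkpointId);
            } catch (IllegalStateException e) {
                outcome[0] = "failed: " + e.getMessage();
            }
        });

        // 2. The abort arrives, but the mailed action is not removed.
        writer.abort(checkpointId);

        // 3. The mailbox is drained and the stale action still runs.
        Runnable mail;
        while ((mail = mailbox.poll()) != null) {
            mail.run();
        }
        return outcome[0];
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

In this toy version the stale action runs after the abort and hits the missing result. Whether the real code can reach the same state depends on how the mailbox action checks for an aborted checkpoint, which is what I would like to have confirmed.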




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

