azagrebin commented on a change in pull request #7571: [FLINK-10724] Refactor
failure handling in check point coordinator
URL: https://github.com/apache/flink/pull/7571#discussion_r275345705
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
##########
@@ -666,10 +671,11 @@ else if (!props.forceCheckpoint()) {
* Receives a {@link DeclineCheckpoint} message for a pending checkpoint.
*
* @param message Checkpoint decline from the task manager
+ * @return <code>true</code> if should fail the job
*/
- public void receiveDeclineMessage(DeclineCheckpoint message) {
+ public boolean receiveDeclineMessage(DeclineCheckpoint message) {
Review comment:
@yanghua I think @StefanRRichter is right about the order of the steps. The
confusion probably comes from the fact that section [T1-5] in the design doc
should actually be a section [T3]: the final step, where we switch from the
current TM-side failure handling to CheckpointFailureManager. I understood
[T1-5] more as an explanation of the motivation for [T2]. Maybe we could merge
T2/T3 into one step, but as Stefan says, it would be strange to merge T1 and
T3. I would stick to the [T1-T2-T3] PR ordering.