azagrebin commented on a change in pull request #7571: [FLINK-10724] Refactor
failure handling in check point coordinator
URL: https://github.com/apache/flink/pull/7571#discussion_r275345705
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
##########
@@ -666,10 +671,11 @@ else if (!props.forceCheckpoint()) {
* Receives a {@link DeclineCheckpoint} message for a pending checkpoint.
*
* @param message Checkpoint decline from the task manager
+ * @return <code>true</code> if should fail the job
*/
- public void receiveDeclineMessage(DeclineCheckpoint message) {
+ public boolean receiveDeclineMessage(DeclineCheckpoint message) {
Review comment:
@yanghua I think @StefanRRichter is right about the order of the steps. The
confusion probably comes from the fact that section [T1-5] in the design doc
should actually be a section [T3]: the final step, where we switch from the
current TM-side failure handling to CheckpointFailureManager. I understood
[T1-5] more as an explanation of the motivation for [T2]. Maybe we could merge
T2 and T3 into one step but, as Stefan says, it would be strange to merge T1
and T3. I would stick to the [T1-T2-T3] PR ordering.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services