StefanRRichter commented on a change in pull request #7571: [FLINK-10724]
Refactor failure handling in check point coordinator
URL: https://github.com/apache/flink/pull/7571#discussion_r276134032
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PendingCheckpoint.java
##########
@@ -101,7 +100,7 @@
private final CheckpointStorageLocation targetLocation;
/** The promise to fulfill once the checkpoint has been completed. */
- private final CompletableFuture<CompletedCheckpoint>
onCompletionPromise;
+ private final CompletableFuture<CheckpointExecutionResult>
onCompletionPromise;
Review comment:
@yanghua I can read what was written in the design doc about the intented
purposes of those classes, but you still could not convince me that they give
any additional value. Actually, I would make the same argument against
`CheckpointTriggerResult`: why not have a return value for the happy case and a
`CheckpointTriggerException` with `CheckpointAbortReason` for the exceptional
case on `triggerCheckpoint`? Your advantages are little concrete:
- about the first: what is good in a class that combines two disjunct
outcome cases? I mean except that some of the fields must always be `null`, you
need to check boolean methods for the control flow and can trigger ISE if you
call something wrong.
- about the second: I don't know what you mean by that. As I see it, the
failure manager can later just have 3 different methods like:
`reportCheckpointSuccess`, `reportCheckpointTriggertFailure`,
`reportCheckpointExecutionFailure` - where the last two receive custom
exceptions that bring in the cause enum.
Now I give some more concrete arguments against it:
- More code, more complexity, more bugs (see, we already found some in
there).
- We need control flow with many `if`s, instead of just having the normal
control flow for the happy case and a clear, separated path for the exceptional
case. That is how failure handling is typically done in Java and Flink (I know
there are some arguments against it, but that is just how things presently
are).
- People that work with the code need to figure out what disjunct
combinations are allowed to be represented by those combined results and can
make mistakes. It also somewhat encurages use of `null`.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services