XComp commented on a change in pull request #14798:
URL: https://github.com/apache/flink/pull/14798#discussion_r579234673
##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerBase.java
##########
@@ -635,6 +641,41 @@ public void cancel() {
return executionGraph.getTerminationFuture().thenApply(FunctionUtils.nullFn());
}
+ protected void archiveGlobalFailure(Throwable failure) {
+ archiveGlobalFailure(failure, executionGraph.getStatusTimestamp(JobStatus.FAILED));
+ }
+
+ protected void archiveGlobalFailure(Throwable failure, long timestamp) {
+ taskFailureHistory.add(new ErrorInfo(failure, timestamp));
+ log.debug("Archive global failure.", failure);
+ }
+
+ protected void archiveFromFailureHandlingResult(FailureHandlingResult failureHandlingResult) {
+ final Optional<Execution> executionOptional =
+ failureHandlingResult
+ .getExecutionVertexIdOfFailedTask()
+ .map(this::getExecutionVertex)
+ .map(ExecutionVertex::getCurrentExecutionAttempt);
+
+ if (executionOptional.isPresent()) {
Review comment:
I'm hesitant to change that: IMHO, it makes the code harder to read and
more prone to misbehaving. The current version of the code is straightforward:
use the `FAILED` timestamp provided by the corresponding `Execution` if we have
the corresponding `ExecutionAttemptId`. Additionally, your change request might
cause unwanted behavior if we later decide to introduce global failures that
are caused by some local failure. I consider the fact that a global failure
does not have a causing `Execution` an implementation detail which we don't
have to expose here. @tillrohrmann Is that reasonable?
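
To make the branching I'm describing concrete, here is a minimal, self-contained sketch. It is not the code in this PR: `FailureArchiveSketch` and its nested `ErrorInfo` and `Execution` classes are hypothetical stand-ins rather than the Flink classes, and passing the job-level `FAILED` timestamp in through the constructor is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Illustrative sketch only; every type here is a hypothetical stand-in, not a Flink class. */
public class FailureArchiveSketch {

    /** Stand-in for an archived failure: the cause plus the time it was recorded for. */
    static final class ErrorInfo {
        final Throwable cause;
        final long timestamp;

        ErrorInfo(Throwable cause, long timestamp) {
            this.cause = cause;
            this.timestamp = timestamp;
        }
    }

    /** Stand-in for an execution attempt that knows when it entered the FAILED state. */
    static final class Execution {
        final long failedTimestamp;

        Execution(long failedTimestamp) {
            this.failedTimestamp = failedTimestamp;
        }
    }

    private final List<ErrorInfo> failureHistory = new ArrayList<>();

    // Assumption: the job-level FAILED timestamp is already known when archiving.
    private final long jobFailedTimestamp;

    FailureArchiveSketch(long jobFailedTimestamp) {
        this.jobFailedTimestamp = jobFailedTimestamp;
    }

    /** Archives a failure, choosing the timestamp based on whether a failed execution is known. */
    void archive(Throwable failure, Optional<Execution> failedExecution) {
        if (failedExecution.isPresent()) {
            // Local failure: use the FAILED timestamp of the execution that caused it.
            failureHistory.add(new ErrorInfo(failure, failedExecution.get().failedTimestamp));
        } else {
            // No causing execution is known: fall back to the job-level FAILED timestamp.
            failureHistory.add(new ErrorInfo(failure, jobFailedTimestamp));
        }
    }

    List<ErrorInfo> failureHistory() {
        return failureHistory;
    }
}
```

The caller only has to say whether a failed execution is known. Whether a failure is "global" is never asked, so a global failure that someday carries a causing execution would go through the first branch without any change to the interface.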