[ https://issues.apache.org/jira/browse/FLINK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17349030#comment-17349030 ]
Matthias commented on FLINK-21439:
----------------------------------

I feel like we can align the failure handling implemented in {{DefaultScheduler}} with the one provided by {{AdaptiveScheduler}}. Until now, it was not necessary to pass the task information into the failure handling routine since we did a global restart in any case. That's why {{AdaptiveScheduler}} uses a more limited {{FailureResult}}, in contrast to the {{FailureHandlingResult}} used in {{DefaultScheduler}}.

With the introduction of the exception history, we'd like to have the task information in place. Hence, it might make sense to leverage the existing functionality of {{DefaultScheduler}}. That way we might even be able to use {{FailureHandlingResultSnapshot.create(FailureHandlingResult failureHandlingResult, Function<ExecutionVertexID, Execution>)}}. But I'm happy to hear other proposals.

> Adaptive Scheduler: Add support for exception history
> -----------------------------------------------------
>
>                 Key: FLINK-21439
>                 URL: https://issues.apache.org/jira/browse/FLINK-21439
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>   Affects Versions: 1.13.0
>            Reporter: Matthias
>            Assignee: John Phelan
>            Priority: Major
>              Labels: pull-request-available, reactive
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> {{SchedulerNG.requestJob}} returns an {{ExecutionGraphInfo}} that was
> introduced in FLINK-21188. This {{ExecutionGraphInfo}} holds the information
> about the {{ArchivedExecutionGraph}} and the exception history.
> Currently, the exception history is a list of {{ErrorInfos}}. This might change
> due to ongoing work in FLINK-21190, where we might introduce a wrapper class
> with more information on the failure.
> The goal of this ticket is to implement the exception history for the
> {{AdaptiveScheduler}}, i.e. collecting the exceptions that caused restarts.
> This collection of failures should be forwarded through
> {{SchedulerNG.requestJob}}.
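
To make the proposal in the comment more concrete, here is a rough, self-contained sketch of the idea: the failure result carries the failing task (if any), and a history entry is created from it using a lookup function, similar in shape to the {{Function<ExecutionVertexID, Execution>}} parameter mentioned above. Every type here ({{ExceptionHistorySketch}}, its nested {{FailureResult}}, {{Execution}}, {{HistoryEntry}}, {{ExceptionHistory}}) is a hypothetical stand-in written for illustration, not the actual Flink class of the same name, and plain strings stand in for vertex IDs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical stand-ins for illustration only; these are NOT the Flink classes.
final class ExceptionHistorySketch {

    /** Failure result that also names the failing task; a null id means a global failure. */
    static final class FailureResult {
        final Throwable cause;
        final String failingVertexId;
        final long timestamp;

        FailureResult(Throwable cause, String failingVertexId, long timestamp) {
            this.cause = cause;
            this.failingVertexId = failingVertexId;
            this.timestamp = timestamp;
        }

        Optional<String> getFailingVertexId() {
            return Optional.ofNullable(failingVertexId);
        }
    }

    /** Minimal view of the execution that a vertex id resolves to. */
    static final class Execution {
        final String taskName;
        final String location;

        Execution(String taskName, String location) {
            this.taskName = taskName;
            this.location = location;
        }
    }

    /** Immutable history entry; taskName and location stay null for global failures. */
    static final class HistoryEntry {
        final Throwable cause;
        final long timestamp;
        final String taskName;
        final String location;

        private HistoryEntry(Throwable cause, long timestamp, String taskName, String location) {
            this.cause = cause;
            this.timestamp = timestamp;
            this.taskName = taskName;
            this.location = location;
        }

        /** Resolves the failing vertex through the lookup and captures its task information. */
        static HistoryEntry create(FailureResult result, Function<String, Execution> lookup) {
            Execution execution = result.getFailingVertexId().map(lookup).orElse(null);
            return new HistoryEntry(
                    result.cause,
                    result.timestamp,
                    execution == null ? null : execution.taskName,
                    execution == null ? null : execution.location);
        }
    }

    /** Collects one entry per failure that caused a restart. */
    static final class ExceptionHistory {
        private final List<HistoryEntry> entries = new ArrayList<>();

        void record(FailureResult result, Function<String, Execution> lookup) {
            entries.add(HistoryEntry.create(result, lookup));
        }

        /** Read-only copy, e.g. to hand out alongside the archived execution graph. */
        List<HistoryEntry> snapshot() {
            return Collections.unmodifiableList(new ArrayList<>(entries));
        }
    }

    public static void main(String[] args) {
        Map<String, Execution> executions = new HashMap<>();
        executions.put("vertex-1", new Execution("Map (1/2)", "host-a:6121"));

        ExceptionHistory history = new ExceptionHistory();
        history.record(new FailureResult(new RuntimeException("task failed"), "vertex-1", 1L), executions::get);
        history.record(new FailureResult(new IllegalStateException("global failure"), null, 2L), executions::get);

        for (HistoryEntry entry : history.snapshot()) {
            System.out.println(entry.timestamp + " " + entry.cause.getMessage() + " task=" + entry.taskName);
        }
    }
}
```

The point of the lookup function in this sketch is that the history entry copies the task information at failure time, so it remains valid after the scheduler restarts and the original execution is gone. Whether the real {{FailureHandlingResultSnapshot}} can be reused as-is in {{AdaptiveScheduler}} is the open question raised in the comment.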