[ https://issues.apache.org/jira/browse/FLINK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351585#comment-17351585 ]
John Phelan commented on FLINK-21439:
-------------------------------------

{{FailureHandlingResult}} is a convenient place because it has all the information? It is not convenient for {{AdaptiveScheduler}}, though. Implementing error handling for {{AdaptiveScheduler}} seems to require either introducing a {{FailureHandlingResult}} prematurely or duplicating a lot of code that introspects execution graphs etc. If I remember correctly, adding a {{FailureHandlingResult}} is an explicit non-goal for {{AdaptiveScheduler}}'s first implementation. Copying a lot of code into {{AdaptiveScheduler}} doesn't seem ideal either.

Maybe there is a shared basis of execution information that could live in some refactored space, such as an {{ExecutionGraphUtils}} class or static methods in some {{ExecutionGraph}}-related class. *I think this would be the best approach.* If such a refactoring makes sense, would this shared execution-information class be worth discussing with your broader team?

> Adaptive Scheduler: Add support for exception history
> -----------------------------------------------------
>
> Key: FLINK-21439
> URL: https://issues.apache.org/jira/browse/FLINK-21439
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.13.0
> Reporter: Matthias
> Assignee: John Phelan
> Priority: Major
> Labels: pull-request-available, reactive
> Time Spent: 3h
> Remaining Estimate: 0h
>
> {{SchedulerNG.requestJob}} returns an {{ExecutionGraphInfo}} that was
> introduced in FLINK-21188. This {{ExecutionGraphInfo}} holds the information
> about the {{ArchivedExecutionGraph}} and exception history information.
> Currently, it's a list of {{ErrorInfos}}. This might change due to ongoing
> work in FLINK-21190, where we might introduce a wrapper class with more
> information on the failure.
> The goal of this ticket is to implement the exception history for the
> {{AdaptiveScheduler}}, i.e.
> collecting the exceptions that caused restarts.
> This collection of failures should be forwarded through
> {{SchedulerNG.requestJob}}.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)