[ https://issues.apache.org/jira/browse/FLINK-25586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476743#comment-17476743 ]
Zhanghao Chen commented on FLINK-25586:
---------------------------------------

[~dmvk] I think the CANCELLED state should be treated as a successful execution, "successful" in the sense that the terminal state matches what the user expects from their action (cancelling the job). This is not the case for the FAILED state, as nobody expects a job to end up FAILED when they submit it.

> ExecutionGraphInfoStore in session cluster should split failed and successful jobs
> ----------------------------------------------------------------------------------
>
> Key: FLINK-25586
> URL: https://issues.apache.org/jira/browse/FLINK-25586
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.12.7, 1.13.5, 1.14.2
> Reporter: Shammon
> Priority: Major
>
> In a Flink session cluster, jobs are stored in `FileExecutionGraphInfoStore`.
> When the number of jobs in it reaches `jobstore.cache-size` or the lifetime
> of a job reaches `jobstore.expiration-time`, the corresponding jobs are removed.
> We can't hold too many jobs for performance reasons, but we should hold
> failed jobs for a longer time to trace the cause of failure. So it's better to
> split failed and successful jobs in `FileExecutionGraphInfoStore` and support
> an independent max capacity for each.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
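
For illustration only, here is a minimal sketch of the split proposed in the issue description, with CANCELLED grouped with the successful jobs as suggested in the comment. Everything in it is an assumption made for the example: `SplitJobStore`, `BoundedMap`, and `TerminalState` are hypothetical names, not Flink classes, and this is not how `FileExecutionGraphInfoStore` is implemented. It keeps entries in memory only, models capacity as a simple entry count, and leaves out expiration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: two independently bounded partitions for terminated jobs. */
final class SplitJobStore {

    /** Hypothetical terminal states for this example; not Flink's JobStatus enum. */
    enum TerminalState { FINISHED, CANCELLED, FAILED }

    /** Insertion-ordered map that drops its oldest entry once capacity is exceeded. */
    static final class BoundedMap<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        BoundedMap(int capacity) {
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    private final Map<String, TerminalState> failed;
    private final Map<String, TerminalState> successful;

    SplitJobStore(int failedCapacity, int successfulCapacity) {
        this.failed = new BoundedMap<>(failedCapacity);
        this.successful = new BoundedMap<>(successfulCapacity);
    }

    /** FAILED jobs go to their own partition; FINISHED and CANCELLED share the other. */
    void put(String jobId, TerminalState state) {
        Map<String, TerminalState> target =
                state == TerminalState.FAILED ? failed : successful;
        target.put(jobId, state);
    }

    /** Returns the stored state, or null if the job was never stored or was evicted. */
    TerminalState get(String jobId) {
        TerminalState state = failed.get(jobId);
        return state != null ? state : successful.get(jobId);
    }

    int failedCount() {
        return failed.size();
    }

    int successfulCount() {
        return successful.size();
    }

    public static void main(String[] args) {
        // Room for two failed jobs but only one successful job.
        SplitJobStore store = new SplitJobStore(2, 1);
        store.put("job-a", TerminalState.FAILED);
        store.put("job-b", TerminalState.FINISHED);
        store.put("job-c", TerminalState.CANCELLED); // evicts job-b, not job-a

        System.out.println(store.get("job-a")); // FAILED
        System.out.println(store.get("job-b")); // null
        System.out.println(store.get("job-c")); // CANCELLED
    }
}
```

With separate capacities, a burst of successful or cancelled jobs can only evict entries from its own partition, so failed jobs stay available for diagnosis until the failed partition itself fills up.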