[jira] [Commented] (FLINK-25586) ExecutionGraphInfoStore in session cluster should split failed and successful jobs

Shammon (Jira) Sun, 09 Jan 2022 23:16:05 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-25586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17471704#comment-17471704
 ]


Shammon commented on FLINK-25586:
---------------------------------

[~Zhanghao Chen] Would you like to pick up this issue? 

> ExecutionGraphInfoStore in session cluster should split failed and successful 
> jobs
> ----------------------------------------------------------------------------------
>
>                 Key: FLINK-25586
>                 URL: https://issues.apache.org/jira/browse/FLINK-25586
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.12.7, 1.13.5, 1.14.2
>            Reporter: Shammon
>            Priority: Major
>
> In a Flink session cluster, jobs are stored in `FileExecutionGraphInfoStore`. 
> When the number of jobs in it reaches `jobstore.cache-size`, or the lifetime 
> of a job reaches `jobstore.expiration-time`, the affected jobs are removed. 
> We can't hold too many jobs for performance reasons, but we should hold 
> failed jobs for a longer time to trace the cause of failure. So it's better to 
> split failed and successful jobs in `FileExecutionGraphInfoStore` and support 
> an independent max capacity for each.
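
A minimal sketch of the idea described above, not Flink's actual implementation: two bounded maps, one for failed jobs and one for successful jobs, each with its own capacity. The class name `SplitJobStore`, the helper `BoundedMap`, the `String` job IDs and payloads, and both capacity parameters are hypothetical and chosen only for illustration. Time-based expiration (`jobstore.expiration-time`) is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: keeps failed and successful jobs in separate bounded maps. */
final class SplitJobStore {

    /** Insertion-ordered map that evicts its oldest entry once capacity is exceeded. */
    static final class BoundedMap<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        BoundedMap(int capacity) {
            super(16, 0.75f, false); // false = keep insertion order
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    private final Map<String, String> successfulJobs;
    private final Map<String, String> failedJobs;

    /** Both capacities are assumed settings; they do not correspond to real Flink options. */
    SplitJobStore(int maxSuccessful, int maxFailed) {
        this.successfulJobs = new BoundedMap<>(maxSuccessful);
        this.failedJobs = new BoundedMap<>(maxFailed);
    }

    /** Stores the job in the map that matches its final state. */
    void put(String jobId, String info, boolean failed) {
        (failed ? failedJobs : successfulJobs).put(jobId, info);
    }

    /** Looks in the failed map first, then in the successful map. */
    String get(String jobId) {
        String info = failedJobs.get(jobId);
        return info != null ? info : successfulJobs.get(jobId);
    }

    int size() {
        return successfulJobs.size() + failedJobs.size();
    }
}
```

With a small capacity for successful jobs and a larger one for failed jobs, a burst of successful jobs evicts only other successful jobs, so failed entries stay available for diagnosis.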



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
