[ 
https://issues.apache.org/jira/browse/FLINK-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379698#comment-16379698
 ] 

ASF GitHub Bot commented on FLINK-8777:
---------------------------------------

Github user sihuazhou commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5578#discussion_r171130767
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/TaskLocalStateStoreImpl.java ---
    @@ -159,6 +166,11 @@ public TaskStateSnapshot retrieveLocalState(long checkpointID) {
                TaskStateSnapshot snapshot;
                synchronized (lock) {
                        snapshot = storedTaskStateByCheckpointID.get(checkpointID);
    +
    +                   if (retrieveWithDiscard) {
    +                           // Only the TaskStateSnapshot.checkpointID == checkpointID is useful, we remove the others
    --- End diff --
    
    👍 


> improve resource release when recovery from failover
> ----------------------------------------------------
>
>                 Key: FLINK-8777
>                 URL: https://issues.apache.org/jira/browse/FLINK-8777
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Major
>             Fix For: 1.5.0
>
>
> When a job recovers from a failure, 
> {{TaskLocalStateStoreImpl.retrieveLocalState()}} is invoked. At that point 
> we can release every entry in {{storedTaskStateByCheckpointID}} that does 
> not satisfy {{entry.checkpointID == checkpointID}}. This prevents a resource 
> leak when a job loops through {{local checkpoint completed => failed => 
> local checkpoint completed => failed ...}}.
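
The pruning described in the issue can be sketched as follows. This is a minimal illustration, not the code from the pull request: the quoted diff does not show the body of the `retrieveWithDiscard` branch, so the class `LocalStateStoreSketch`, the method `retrieveAndPrune`, the `String` stand-in for `TaskStateSnapshot`, and the `discardAction` callback are all hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

/** Simplified, hypothetical stand-in for a task-local state store. */
final class LocalStateStoreSketch {

    private final Object lock = new Object();

    // checkpoint ID -> snapshot; a String stands in for TaskStateSnapshot.
    private final Map<Long, String> storedTaskStateByCheckpointID = new TreeMap<>();

    // Hypothetical hook that releases the resources held by one snapshot.
    private final Consumer<String> discardAction;

    LocalStateStoreSketch(Consumer<String> discardAction) {
        this.discardAction = discardAction;
    }

    void store(long checkpointID, String snapshot) {
        synchronized (lock) {
            storedTaskStateByCheckpointID.put(checkpointID, snapshot);
        }
    }

    /**
     * Returns the snapshot stored for checkpointID (or null if there is none)
     * and removes every entry that belongs to a different checkpoint.
     */
    String retrieveAndPrune(long checkpointID) {
        List<String> toDiscard = new ArrayList<>();
        String snapshot;
        synchronized (lock) {
            snapshot = storedTaskStateByCheckpointID.get(checkpointID);
            Iterator<Map.Entry<Long, String>> it =
                storedTaskStateByCheckpointID.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Long, String> entry = it.next();
                if (entry.getKey().longValue() != checkpointID) {
                    toDiscard.add(entry.getValue());
                    it.remove(); // safe removal while iterating
                }
            }
        }
        // Run the (possibly slow) release work outside the lock.
        toDiscard.forEach(discardAction);
        return snapshot;
    }

    int size() {
        synchronized (lock) {
            return storedTaskStateByCheckpointID.size();
        }
    }
}
```

In this sketch the entries are collected under the lock and discarded afterwards, so a slow release does not block other callers. Whether the actual change does the same is not visible in the quoted diff.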



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
