[
https://issues.apache.org/jira/browse/FLINK-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380140#comment-16380140
]
ASF GitHub Bot commented on FLINK-8777:
---------------------------------------
Github user StefanRRichter commented on the issue:
https://github.com/apache/flink/pull/5578
Ok, I took another look at the complete picture, and the test gave me the
feeling that retrieval and pruning should be two separate concerns: not only
should we have two internal methods, but maybe we should also expose them as
different methods. To keep this short, I made a proposal in this
branch:
https://github.com/StefanRRichter/flink/tree/improve_resource_release_for_local_recovery
If you like the change, I would squash it and commit under your name,
because you did all of the important parts. What do you think?
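As a rough illustration of the split described above (this is not the code
in the linked branch; the class name, the String value type, and the
store/pruneCheckpoints/size methods are hypothetical stand-ins, and only
retrieveLocalState and storedTaskStateByCheckpointID are names taken from
the issue), retrieval and pruning could be exposed as two methods roughly
like this:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.LongPredicate;

// Simplified, hypothetical stand-in for a task-local state store.
class LocalStateStoreSketch {

    // Assumption: state is modelled as a plain String keyed by checkpoint ID.
    private final Map<Long, String> storedTaskStateByCheckpointID = new TreeMap<>();

    void store(long checkpointID, String state) {
        storedTaskStateByCheckpointID.put(checkpointID, state);
    }

    // Retrieval only: looks up the state and never removes anything.
    String retrieveLocalState(long checkpointID) {
        return storedTaskStateByCheckpointID.get(checkpointID);
    }

    // Pruning only: drops every entry whose checkpoint ID matches the
    // predicate and returns how many entries were released.
    int pruneCheckpoints(LongPredicate shouldDiscard) {
        int before = storedTaskStateByCheckpointID.size();
        storedTaskStateByCheckpointID.keySet().removeIf(id -> shouldDiscard.test(id));
        return before - storedTaskStateByCheckpointID.size();
    }

    int size() {
        return storedTaskStateByCheckpointID.size();
    }

    public static void main(String[] args) {
        LocalStateStoreSketch store = new LocalStateStoreSketch();
        store.store(1L, "state-1");
        store.store(2L, "state-2");
        store.store(3L, "state-3");

        long restoreFrom = 2L;
        // On recovery: first retrieve, then release everything else.
        String state = store.retrieveLocalState(restoreFrom);
        int released = store.pruneCheckpoints(id -> id != restoreFrom);
        System.out.println(state + ", released " + released + ", remaining " + store.size());
    }
}
```

With the two concerns separated, a caller recovering from checkpoint N can
retrieve first and then prune with a predicate such as id != N, which is the
release rule proposed in the issue description.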
> improve resource release when recovery from failover
> ----------------------------------------------------
>
> Key: FLINK-8777
> URL: https://issues.apache.org/jira/browse/FLINK-8777
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Affects Versions: 1.5.0
> Reporter: Sihua Zhou
> Assignee: Sihua Zhou
> Priority: Major
> Fix For: 1.5.0
>
>
> When recovering from a failure, {{TaskLocalStateStoreImpl.retrieveLocalState()}}
> is invoked. At that point we can release every entry in
> {{storedTaskStateByCheckpointID}} that does not satisfy {{entry.checkpointID
> == checkpointID}}. This prevents a resource leak when the job loops through
> {{local checkpoint completed => failed => local checkpoint completed =>
> failed ...}}.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)