[
https://issues.apache.org/jira/browse/FLINK-11159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725719#comment-16725719
]
Till Rohrmann commented on FLINK-11159:
---------------------------------------
Wouldn't the problem be solved if Flink would cache the savepoint locally after
having resumed from it? We need to download the savepoint files anyway.
> Allow configuration whether to fall back to savepoints for restore
> ------------------------------------------------------------------
>
> Key: FLINK-11159
> URL: https://issues.apache.org/jira/browse/FLINK-11159
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Affects Versions: 1.5.5, 1.6.2, 1.7.0
> Reporter: Nico Kruber
> Assignee: vinoyang
> Priority: Major
>
> Ever since FLINK-3397, upon failure, Flink would restart from the latest
> checkpoint/savepoint which ever is more recent. With the introduction of
> local recovery and the knowledge that a RocksDB checkpoint restore would just
> copy the files, it may be time to re-consider / making this configurable:
> In certain situations, it may be faster to restore from the latest checkpoint
> only (even if there is a more recent savepoint) and reprocess the data
> between. On the downside, though, that may not be correct because that might
> break side effects if the savepoint was the latest one, e.g. consider this
> chain: {{chk1 -> chk2 -> sp … restore chk2 -> …}}. Then all side effects
> between {{chk2 -> sp}} would be reproduced.
> Making this configurable will allow the user to set whatever he needs / can
> to get the lowest recovery time in Flink.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)