[jira] [Commented] (FLINK-11159) Allow configuration whether to fall back to savepoints for restore

vinoyang (JIRA) Fri, 18 Jan 2019 07:36:10 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-11159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746405#comment-16746405
 ]


vinoyang commented on FLINK-11159:
----------------------------------

If the user enables this option, we can think of a savepoint as a "dynamic (not 
periodic) checkpoint". It provides the "pause/resume" function in a faster and 
more efficient way. Of course, the roles that a savepoint already has (such as 
version upgrades) remain. I think we really need this feature. If 
you agree, I am willing to provide a design document. What do you 
think about the idea? cc [~till.rohrmann] [~Zentol]

> Allow configuration whether to fall back to savepoints for restore
> ------------------------------------------------------------------
>
>                 Key: FLINK-11159
>                 URL: https://issues.apache.org/jira/browse/FLINK-11159
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.5, 1.6.2, 1.7.0
>            Reporter: Nico Kruber
>            Assignee: vinoyang
>            Priority: Major
>
> Ever since FLINK-3397, upon failure, Flink restarts from the latest 
> checkpoint or savepoint, whichever is more recent. With the introduction of 
> local recovery and the knowledge that a RocksDB checkpoint restore would just 
> copy the files, it may be time to reconsider this and make it configurable:
> In certain situations, it may be faster to restore from the latest checkpoint 
> only (even if there is a more recent savepoint) and reprocess the data in 
> between. On the downside, though, that may not be correct because it might 
> break side effects if the savepoint was the latest snapshot, e.g. consider this 
> chain: {{chk1 -> chk2 -> sp … restore chk2 -> …}}. Then all side effects 
> between {{chk2 -> sp}} would be reproduced.
> Making this configurable will allow users to set whatever they need / can 
> to get the lowest recovery time in Flink.
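
The trade-off in the description above can be illustrated with a small sketch. 
It is not Flink code: the class RestoreSelection, the Snapshot type and the 
preferCheckpoint flag are hypothetical names chosen for this illustration, and 
the actual option name and semantics would be settled in the design document. 
The sketch only shows how the choice of restore point changes for the chain 
chk1 -> chk2 -> sp.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; none of these names are Flink APIs.
public class RestoreSelection {

    /** A completed snapshot. A larger id means the snapshot was taken later. */
    static final class Snapshot {
        final long id;
        final boolean savepoint;

        Snapshot(long id, boolean savepoint) {
            this.id = id;
            this.savepoint = savepoint;
        }
    }

    /**
     * Picks the snapshot to restore from after a failure.
     * preferCheckpoint == false: latest snapshot of any kind (behaviour since FLINK-3397).
     * preferCheckpoint == true:  latest checkpoint, skipping any newer savepoint.
     * Returns empty if no eligible snapshot exists.
     */
    static Optional<Snapshot> pick(List<Snapshot> completed, boolean preferCheckpoint) {
        Snapshot best = null;
        for (Snapshot s : completed) {
            if (preferCheckpoint && s.savepoint) {
                continue; // savepoints are not eligible in this mode
            }
            if (best == null || s.id > best.id) {
                best = s;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // The chain from the issue description: chk1 -> chk2 -> sp
        List<Snapshot> history = List.of(
                new Snapshot(1, false),
                new Snapshot(2, false),
                new Snapshot(3, true));

        System.out.println(pick(history, false).get().id); // 3: restore from sp
        System.out.println(pick(history, true).get().id);  // 2: restore from chk2
    }
}
```

With preferCheckpoint set, the job restores from chk2 and reprocesses everything 
between chk2 and sp, which is exactly where side effects would be reproduced. 
The sketch returns an empty result when only savepoints exist; whether to fall 
back to a savepoint in that case is a design decision it leaves open.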



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
