[jira] [Comment Edited] (FLINK-8753) Introduce Incremental savepoint

Sihua Zhou (JIRA) Mon, 26 Feb 2018 18:27:23 -0800

    [ 
https://issues.apache.org/jira/browse/FLINK-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377908#comment-16377908
 ]


Sihua Zhou edited comment on FLINK-8753 at 2/27/18 2:26 AM:
------------------------------------------------------------

[~StephanEwen] Thanks for your reply. Indeed, what I am trying to achieve is 
just a faster savepoint that does not need to iterate over all records one by 
one (along with some condition checks that make it slow for huge data). And 
yes, what you described is very close to what I wanted. The reason I didn't use 
the word `checkpoint` is that a checkpoint is not guaranteed to support 
rescaling (this can be found in the 
[flink-doc|https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/checkpoints.html#difference-to-savepoints]
 and the comment in PR [5490|https://github.com/apache/flink/pull/5490]), 
and rescaling is always the reason we trigger a savepoint. An interesting thing 
I found is that, in the implementation, checkpoints also support rescaling; I 
checked that both in the code and in practice ... I wonder whether the "archive 
checkpoint" that you mentioned is guaranteed to support rescaling? 

About the implementation, I think maybe this issue's title is incorrect ... I 
just want to implement a savepoint that goes through the incremental checkpoint 
path but treats the `baseSstFile` as empty (which looks like just uploading the 
local RocksDB snapshot to DFS)...
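
A minimal sketch of the idea in the paragraph above, assuming a simplified model 
in which a snapshot is just a set of SST file names. This is not Flink code: 
`files_to_upload`, `local_sst_files` and `base_sst_files` are hypothetical names 
used only for illustration. It shows that the incremental path with an empty 
base degenerates into uploading the whole local snapshot.

```python
def files_to_upload(local_sst_files, base_sst_files):
    """Return the SST files that the base snapshot does not already contain.

    Hypothetical helper: models the incremental path as a set difference
    between the local snapshot and the previously uploaded base files.
    """
    return sorted(set(local_sst_files) - set(base_sst_files))


# Illustrative file names, not taken from a real job.
local = {"000012.sst", "000015.sst", "000019.sst"}
previous = {"000012.sst", "000015.sst"}

# Incremental checkpoint: only files missing from the base are uploaded.
incremental = files_to_upload(local, previous)

# Proposed savepoint: same path, but the base is treated as empty,
# so every local file is uploaded and the result is self-contained.
self_contained = files_to_upload(local, set())
```

Under this model the savepoint never references files from earlier checkpoints, 
and no record-by-record iteration is needed to produce it.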


was (Author: sihuazhou):
[~StephanEwen] Thanks for your reply. Indeed, what I am trying to achieve is 
just a faster savepoint that does not need to iterate over all records one by 
one (along with some condition checks that make it slow for huge data). And 
yes, what you described is very close to what I wanted. The reason I didn't use 
the word `checkpoint` is that a checkpoint is not guaranteed to support 
rescaling (this can be found in the 
[flink-doc|https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/checkpoints.html#difference-to-savepoints]
 and the comment in PR [5490|https://github.com/apache/flink/pull/5490]), 
and rescaling is always the reason we trigger a savepoint. An interesting thing 
I found is that, in the implementation, checkpoints also support rescaling; I 
checked that both in the code and in practice ... I wonder whether the "archive 
checkpoint" is guaranteed to support rescaling? 

About the implementation, I think maybe this issue's title is incorrect ... I 
just want to implement a savepoint that goes through the incremental 
checkpoint path but treats the `baseSstFile` as empty (which looks like just 
uploading the local RocksDB snapshot to DFS).

> Introduce Incremental savepoint
> -------------------------------
>
>                 Key: FLINK-8753
>                 URL: https://issues.apache.org/jira/browse/FLINK-8753
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Major
>
> Right now, a savepoint goes through the full checkpoint path, so taking a 
> savepoint can be slow. In our production environment, for some long-running 
> jobs it often takes more than 10 min to complete a savepoint, which is 
> unacceptable for a real-time job, so we currently have to fall back to using 
> externalized checkpoints instead. But with externalized checkpoints there is 
> a time gap (up to the checkpoint interval) since the last checkpoint. So I 
> propose introducing an incremental savepoint that goes through the 
> incremental checkpoint path.
> Any advice would be appreciated!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
