[jira] [Updated] (FLINK-3397) Failed streaming jobs should fall back to the most recent checkpoint/savepoint

ramkrishna.s.vasudevan (JIRA) Wed, 06 Jul 2016 03:27:28 -0700

     [ 
https://issues.apache.org/jira/browse/FLINK-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


ramkrishna.s.vasudevan updated FLINK-3397:
------------------------------------------
    Attachment: FLINK-3397.pdf

Just adding a doc to highlight how things work now. If I am missing something or 
am wrong, please do correct me. Also, I am not sure what proposal or solution 
[~uce] has in mind; based on that, I can create tasks and work on them. Valuable 
feedback/comments are welcome, and pardon any misleading or wrong info. I 
can update the doc based on the discussion here.

> Failed streaming jobs should fall back to the most recent checkpoint/savepoint
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-3397
>                 URL: https://issues.apache.org/jira/browse/FLINK-3397
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing, Streaming
>    Affects Versions: 1.0.0
>            Reporter: Gyula Fora
>            Priority: Minor
>         Attachments: FLINK-3397.pdf
>
>
> The current fallback behaviour in case of a streaming job failure is slightly 
> counterintuitive:
> If a job fails, it will fall back to the most recent checkpoint (if any) even 
> if a more recent savepoint was taken. This means that savepoints are not 
> regarded as checkpoints by the system, only as points from which a job can be 
> manually restarted.
> I suggest changing this so that savepoints are also regarded as checkpoints 
> in case of a failure, and are also used to automatically restore the 
> streaming job.
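
To make the difference between the two policies concrete, here is a minimal sketch in plain Java. It is only an illustration of the selection rule described in the issue, not Flink's actual recovery code: the Snapshot class, its fields, and both method names are hypothetical and do not correspond to any Flink API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class RestoreSketch {

    /** Hypothetical stand-in for a completed snapshot; not a Flink class. */
    static final class Snapshot {
        final long id;
        final long timestampMillis;
        final boolean savepoint; // true if triggered manually as a savepoint

        Snapshot(long id, long timestampMillis, boolean savepoint) {
            this.id = id;
            this.timestampMillis = timestampMillis;
            this.savepoint = savepoint;
        }
    }

    /** Behaviour described as current: savepoints are ignored on failure. */
    static Optional<Snapshot> latestCheckpointOnly(List<Snapshot> all) {
        return all.stream()
                .filter(s -> !s.savepoint)
                .max(Comparator.comparingLong((Snapshot s) -> s.timestampMillis));
    }

    /** Behaviour proposed in the issue: savepoints are candidates as well. */
    static Optional<Snapshot> latestOfAny(List<Snapshot> all) {
        return all.stream()
                .max(Comparator.comparingLong((Snapshot s) -> s.timestampMillis));
    }

    public static void main(String[] args) {
        List<Snapshot> history = new ArrayList<>();
        history.add(new Snapshot(1, 1000L, false)); // checkpoint
        history.add(new Snapshot(2, 2000L, false)); // checkpoint
        history.add(new Snapshot(3, 3000L, true));  // savepoint, taken last

        System.out.println(latestCheckpointOnly(history).get().id); // prints 2
        System.out.println(latestOfAny(history).get().id);          // prints 3
    }
}
```

With this history, the first policy restores from snapshot 2 and loses the progress captured by the later savepoint, while the second restores from snapshot 3.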



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
