[
https://issues.apache.org/jira/browse/FLINK-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ramkrishna.s.vasudevan updated FLINK-3397:
------------------------------------------
Attachment: FLINK-3397.pdf
I have attached a doc describing how things work today. If I am missing
something or have anything wrong, please correct me. I am also not sure what
proposal or solution [~uce] has in mind; once that is clear, I can create tasks
and work on them. Feedback and comments are welcome, and apologies for any
misleading or incorrect information. I can update the doc based on the
discussion here.
> Failed streaming jobs should fall back to the most recent checkpoint/savepoint
> ------------------------------------------------------------------------------
>
> Key: FLINK-3397
> URL: https://issues.apache.org/jira/browse/FLINK-3397
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing, Streaming
> Affects Versions: 1.0.0
> Reporter: Gyula Fora
> Priority: Minor
> Attachments: FLINK-3397.pdf
>
>
> The current fallback behaviour when a streaming job fails is slightly
> counterintuitive:
> If a job fails, it falls back to the most recent checkpoint (if any), even
> if a more recent savepoint has been taken. This means that the system does
> not regard savepoints as checkpoints, only as points from which a job can be
> manually restarted.
> I suggest changing this so that savepoints are also regarded as checkpoints
> in case of a failure and are also used to automatically restore the
> streaming job.
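
To make the difference between the two behaviours concrete, here is a rough
sketch. It does not use Flink's actual classes: RestorePoint, Kind,
latestCheckpointOnly and latestOfAny are hypothetical names made up for
illustration, and "most recent" is assumed to mean the largest completion
timestamp.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class RestorePointSelection {

    // Hypothetical model, not Flink's real checkpoint/savepoint types.
    enum Kind { CHECKPOINT, SAVEPOINT }

    static final class RestorePoint {
        final Kind kind;
        final long id;
        final long timestampMillis; // assumed: completion time of the snapshot

        RestorePoint(Kind kind, long id, long timestampMillis) {
            this.kind = kind;
            this.id = id;
            this.timestampMillis = timestampMillis;
        }
    }

    // Behaviour described in the issue: savepoints are ignored on failure.
    static Optional<RestorePoint> latestCheckpointOnly(List<RestorePoint> points) {
        return points.stream()
                .filter(p -> p.kind == Kind.CHECKPOINT)
                .max(Comparator.comparingLong((RestorePoint p) -> p.timestampMillis));
    }

    // Suggested behaviour: the most recent snapshot wins, whatever its kind.
    static Optional<RestorePoint> latestOfAny(List<RestorePoint> points) {
        return points.stream()
                .max(Comparator.comparingLong((RestorePoint p) -> p.timestampMillis));
    }

    public static void main(String[] args) {
        List<RestorePoint> points = Arrays.asList(
                new RestorePoint(Kind.CHECKPOINT, 1L, 1000L),
                new RestorePoint(Kind.CHECKPOINT, 2L, 2000L),
                new RestorePoint(Kind.SAVEPOINT, 3L, 3000L)); // taken last

        System.out.println(latestCheckpointOnly(points).get().id); // prints 2
        System.out.println(latestOfAny(points).get().id);          // prints 3
    }
}
```

With a savepoint taken after the last checkpoint, the first method restores
from the older checkpoint (id 2), while the second restores from the savepoint
(id 3), which is the fallback the issue asks for.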