[ https://issues.apache.org/jira/browse/FLINK-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated FLINK-3397:
------------------------------------------
    Attachment: FLINK-3397.pdf

Just adding a doc to highlight how things work now. If I am missing something or am wrong, please correct me. As for the proposal or solution, I am not sure what [~uce] has in mind. Based on that, I can create tasks and work on them.
Valuable feedback and comments are welcome, and pardon any misleading or wrong info. I can update the doc based on the discussion here.

> Failed streaming jobs should fall back to the most recent checkpoint/savepoint
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-3397
>                 URL: https://issues.apache.org/jira/browse/FLINK-3397
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing, Streaming
>    Affects Versions: 1.0.0
>            Reporter: Gyula Fora
>            Priority: Minor
>         Attachments: FLINK-3397.pdf
>
>
> The current fallback behaviour in case of a streaming job failure is slightly
> counterintuitive:
> If a job fails, it falls back to the most recent checkpoint (if any), even
> if a more recent savepoint was taken. This means that the system does not
> regard savepoints as checkpoints, only as points from which a job can be
> manually restarted.
> I suggest changing this so that savepoints are also regarded as checkpoints
> in case of a failure and are also used to automatically restore the
> streaming job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
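
For illustration, the difference between the fallback behaviour described in the issue and the suggested one can be sketched as below. This is only a toy model under stated assumptions: `RestorePoint`, `current_fallback`, and `proposed_fallback` are hypothetical names invented for this sketch and are not Flink classes or APIs, and "most recent" is assumed to mean the latest completion timestamp.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RestorePoint:
    """Hypothetical record of a completed snapshot (not a Flink class)."""
    snapshot_id: int
    timestamp_ms: int    # assumed: completion time of the snapshot
    is_savepoint: bool   # True for a savepoint, False for a periodic checkpoint


def current_fallback(points: Iterable[RestorePoint]) -> Optional[RestorePoint]:
    """Behaviour described in the issue: only checkpoints are candidates."""
    checkpoints = [p for p in points if not p.is_savepoint]
    return max(checkpoints, key=lambda p: p.timestamp_ms, default=None)


def proposed_fallback(points: Iterable[RestorePoint]) -> Optional[RestorePoint]:
    """Suggested behaviour: savepoints are candidates as well."""
    return max(list(points), key=lambda p: p.timestamp_ms, default=None)


# Two checkpoints followed by a more recent savepoint.
history = [
    RestorePoint(snapshot_id=1, timestamp_ms=1000, is_savepoint=False),
    RestorePoint(snapshot_id=2, timestamp_ms=2000, is_savepoint=False),
    RestorePoint(snapshot_id=3, timestamp_ms=3000, is_savepoint=True),
]

restored_now = current_fallback(history)        # ignores the savepoint
restored_proposed = proposed_fallback(history)  # picks the savepoint
```

With this history, the first function selects the checkpoint with id 2 and skips the newer savepoint, while the second selects the savepoint with id 3. How an actual implementation would track and compare savepoints is left open in the issue.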