[ https://issues.apache.org/jira/browse/FLINK-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15355534#comment-15355534 ]
ramkrishna.s.vasudevan commented on FLINK-3397:
-----------------------------------------------

[~uce] Any feedback here? Is this going to be a simple logical change in CheckpointCoordinator#restoreLatestCheckpointedState, such that we compare the checkpoint ID from the savepoint with the checkpoint ID from the checkpoint coordinator, see which one is the latest, and then go ahead with that one as the restoration point? Or do you see a larger design change in how savepoints and checkpoints are handled?

> Failed streaming jobs should fall back to the most recent checkpoint/savepoint
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-3397
>                 URL: https://issues.apache.org/jira/browse/FLINK-3397
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 1.0.0
>            Reporter: Gyula Fora
>            Priority: Minor
>
> The current fallback behaviour in case of a streaming job failure is slightly
> counterintuitive:
> If a job fails it will fall back to the most recent checkpoint (if any) even
> if there was a more recent savepoint taken. This means that savepoints are not
> regarded as checkpoints by the system, only as points from where a job can be
> manually restarted.
> I suggest changing this so that savepoints are also regarded as checkpoints
> in case of a failure and they will also be used to automatically restore the
> streaming job.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)