I think falling back to earlier checkpoints/savepoints is orthogonal to this and covered in https://issues.apache.org/jira/browse/FLINK-4815.
As it is currently, you can have the (quite possible) scenario where you take a savepoint and then your job fails. This leads to data duplication in the sink. I think the scenario where you take a savepoint, then delete that savepoint, and then try to recover is unlikely. And even if it does happen, you can manually fix the situation by editing ZooKeeper entries (I think). There is no possible manual recovery for the first scenario, since downstream systems have possibly already consumed the emitted and committed data.

What do you think? @tillrohrmann Is this "revert" still valid, or were there changes in the meantime that could break things?

[ Full content available at: https://github.com/apache/flink/pull/6704 ]
