StephanEwen commented on pull request #13784: URL: https://github.com/apache/flink/pull/13784#issuecomment-716688805
You are right, having to break code between `snapshotState()` and `onCheckpoint(checkpointId)` is not so nice in the end. Having two interfaces `Checkpointable` and `CheckpointListener` starts to get complex for users. Also, it sounds like this implies that the default reader interface does not assume to store any state in checkpoints, which is probably almost never the case (all sources need to track progress). For my initial suggestion, the thought was that quasi all sources produce state snapshots (to store progress information), so that is part of the core API. But knowing about checkpoint ID and whether the checkpoint completed is only necessary for sources that need to communicate back to the external source system (commit offsets, ack messages, etc.) and that is much more rarely the case, so it could be factored out into separate interface. Considering all these points, I am leaning towards simply adding the checkpointId to the `snapshotState()` method. Most cases would just ignore it, but it wouldn't hurt too much (other than users wondering whether that information is important for their implementation in some way). Then we we think whether we want `onCheckpointComplete(checkpointId)` in the core interface, or in a separate one (like `CheckpointListener`). A similar interface exists already, btw, we might just move it to `flink-core`: https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/CheckpointListener.java ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
