Hi,

I've got a couple of questions concerning the topics in the subject:

1. If an operator is applied on a keyed stream, do I still have to implement the CheckpointedFunction trait and define the snapshotState and initializeState methods in order to successfully recover the state after a job failure?
2. While using a FlinkKafkaConsumer, enabling checkpointing allows exactly-once semantics end to end, provided that the sink is able to guarantee the same. Do I have to call setCommitOffsetsOnCheckpoints(true) for this? How would someone implement exactly-once semantics in a sink?
3. What are the advantages of externalized checkpoints, and in which cases would I want to use them?
4. Let's suppose a scenario where checkpointing is enabled every 10 seconds, and I have a Kafka consumer which is set to start from the latest records, a sink providing at-least-once semantics, and a stateful keyed operator in between the consumer and the sink. Is it correct that, in case of task failure, the following happens?
- the Kafka consumer gets reverted to the latest offset (does this happen even if I don't call setCommitOffsetsOnCheckpoints(true)?)
- the operator state gets reverted to the latest checkpoint
- the sink is stateless, so it doesn't really care about what happened
- the stream restarts, and probably some of the events coming to the sink have already been processed before

Thank you for your attention.

Kind regards,
Federico