Hi, For failures recovery with Kafka 0.9 it is not possible to avoid duplicated messages. Using Flink 1.4 (unreleased yet) combined with Kafka 0.11 it will be possible to achieve exactly-once end to end semantic when writing to Kafka. However this still a work in progress:
https://issues.apache.org/jira/browse/FLINK-6988 <https://issues.apache.org/jira/browse/FLINK-6988> However this is a superset of functionality that you are asking for. Exactly-once just for clean shutdowns is also on our “TODO” list (it would/could support Kafka 0.9), but it is not currently being actively developed. Piotr Nowojski > On Oct 2, 2017, at 3:35 PM, Antoine Philippot <antoine.philip...@teads.tv> > wrote: > > Hi, > > I'm working on a flink streaming app with a kafka09 to kafka09 use case which > handles around 100k messages per seconds. > > To upgrade our application we used to run a flink cancel with savepoint > command followed by a flink run with the previous saved savepoint and the new > application fat jar as parameter. We notice that we can have more than 50k of > duplicated messages in the kafka sink wich is not idempotent. > > This behaviour is actually problematic for this project and I try to find a > solution / workaround to avoid these duplicated messages. > > The JobManager indicates clearly that the cancel call is triggered once the > savepoint is finished, but during the savepoint execution, kafka source > continue to poll new messages which will not be part of the savepoint and > will be replayed on the next application start. > > I try to find a solution with the stop command line argument but the kafka > source doesn't implement StoppableFunction > (https://issues.apache.org/jira/browse/FLINK-3404 > <https://issues.apache.org/jira/browse/FLINK-3404>) and the savepoint > generation is not available with stop in contrary to cancel. > > Is there an other solution to not process duplicated messages for each > application upgrade or rescaling ? > > If no, has someone planned to implement it? Otherwise, I can propose a pull > request after some architecture advices. > > The final goal is to stop polling source and trigger a savepoint once polling > stopped. > > Thanks