Hey Yun,

Thanks for your quick response. Much appreciated. I have replied to your answer on SO and will continue with my follow-up questions over there.
Thanks,
Sid

On Fri, Jan 7, 2022 at 9:05 PM Yun Gao <yungao...@aliyun.com> wrote:
> Hi Siddhesh,
>
> I answered on Stack Overflow, and I have also copied the answers here for
> reference:
>
> For the consumer side, the Flink Kafka consumer bookkeeps the current
> offset in the distributed checkpoint. If the consumer task fails, it is
> restarted from the latest checkpoint and re-emits from the offset recorded
> in that checkpoint. For example, suppose the latest checkpoint records
> offset 3, and after that Flink continues to emit 4 and 5 and then fails
> over: Flink would then continue to emit records from 4. Note that this
> does not cause duplication, since the state of all the operators also
> falls back to the state after record 3 was processed.
>
> For the producer side, Flink uses two-phase commit [1] to achieve
> exactly-once. Roughly, the Flink producer relies on Kafka's transactions
> to write data, and only commits the data formally after the transaction
> is committed. Users can use Semantic.EXACTLY_ONCE to enable this
> functionality [2].
>
> You are warmly welcome to reach out to the community for help, and many
> thanks to everyone participating in the community :) I think David and
> Martijn are also trying to make us work together more efficiently. Many
> thanks for the understanding~
>
> Best,
> Yun
>
> [1] https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html
> [2] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/#fault-tolerance
>
> ------------------------------------------------------------------
> From: Siddhesh Kalgaonkar <kalgaonkarsiddh...@gmail.com>
> Send Time: 2022 Jan. 7 (Fri.) 23:25
> To: Martijn Visser <mart...@ververica.com>
> Cc: "David Morávek" <d...@apache.org>; user <user@flink.apache.org>
> Subject: Re: Exactly Once Semantics
>
> Hi Martijn,
>
> Understood. If possible, please help me out with the problem.
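To make the consumer-side behaviour Yun describes concrete, here is a toy simulation (plain Python, not Flink's API; all names are illustrative). It checkpoints the read offset together with operator state, crashes after emitting records 4 and 5, recovers, and replays from offset 3 without double counting, because the state rolls back along with the offset:

```python
# Toy model of checkpoint-based replay: the offset and the operator
# state are snapshotted together, so replayed records are not counted twice.
class Pipeline:
    def __init__(self, records):
        self.records = records
        self.offset = 0          # index of the next record to read
        self.total = 0           # operator state: a running sum
        self.checkpoint = None   # last completed checkpoint

    def take_checkpoint(self):
        # Snapshot offset AND state atomically, as one checkpoint.
        self.checkpoint = (self.offset, self.total)

    def process_next(self):
        self.total += self.records[self.offset]
        self.offset += 1

    def recover(self):
        # On failure, restore offset and state together from the checkpoint.
        self.offset, self.total = self.checkpoint

records = [1, 2, 3, 4, 5]
p = Pipeline(records)
for _ in range(3):
    p.process_next()        # records 1, 2, 3 processed
p.take_checkpoint()         # checkpoint: offset 3, total 6
for _ in range(2):
    p.process_next()        # records 4, 5 processed... then a crash
p.recover()                 # roll back to offset 3, total 6
while p.offset < len(records):
    p.process_next()        # 4 and 5 are re-emitted against rolled-back state
assert p.total == sum(records)  # 15: each record counted exactly once
```

The key point the sketch shows: replaying from offset 3 is safe precisely because the state snapshot and the offset snapshot belong to the same checkpoint.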
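The producer side can be sketched the same way. This is a minimal model of the two-phase commit pattern the transactional sink uses, not the Flink connector itself (the class and method names here are invented for illustration): records are written inside an open transaction, flushed at pre-commit time, and only become visible downstream when the checkpoint completes and the transaction commits:

```python
# Toy two-phase-commit sink: writes land in an open transaction and
# become visible only after commit (i.e. after the checkpoint completes).
class TwoPhaseCommitSink:
    def __init__(self):
        self.open_txn = []       # written but not yet visible
        self.committed = []      # visible to read-committed consumers

    def write(self, record):
        self.open_txn.append(record)      # write inside the transaction

    def pre_commit(self):
        # Phase 1: flush the transaction; it can still be aborted.
        txn, self.open_txn = self.open_txn, []
        return txn

    def commit(self, txn):
        # Phase 2: checkpoint completed, make the records visible.
        self.committed.extend(txn)

    def abort(self, txn):
        # Failure before the checkpoint completed: nothing leaks out.
        pass

sink = TwoPhaseCommitSink()
sink.write("a")
sink.write("b")
txn = sink.pre_commit()
assert sink.committed == []      # nothing visible before commit
sink.commit(txn)                 # the checkpoint completed
assert sink.committed == ["a", "b"]
```

In the real connector, Kafka's transactions play the role of `open_txn`/`commit`, and downstream consumers must read with `isolation.level=read_committed` to see only committed records.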
>
> Thanks,
> Sid
>
> On Fri, Jan 7, 2022 at 8:45 PM Martijn Visser <mart...@ververica.com> wrote:
> Hi Siddhesh,
>
> The purpose of both Stack Overflow and the mailing list is to solve a
> question or a problem; the mailing list is not for getting attention.
> That amounts to crossposting, which we would rather avoid. As David
> mentioned, time is limited and we all try to spend it the best we can.
>
> Best regards,
>
> Martijn
>
> On Fri, Jan 7, 2022 at 4:04 PM Siddhesh Kalgaonkar <kalgaonkarsiddh...@gmail.com> wrote:
> Hi David,
>
> It's actually better, in my opinion, because people who are not aware of
> the ML thread can Google and check the SO posts when they come across
> similar problems. The reason for posting on the ML is to get attention,
> because some questions go unanswered for multiple days, and since we are
> beginners, the only things we have are SO and the ML. I wouldn't call it
> "duplication" but rather "availability of similar problems".
>
> It's okay if you don't want to help.
>
> Cheers!
>
> Sid
>
> On Fri, Jan 7, 2022 at 8:18 PM David Morávek <d...@apache.org> wrote:
> Hi Siddhesh,
>
> Can you please focus your questions on one channel only? (either SO or
> the ML)
>
> Crossposting could lead to unnecessary duplication of work (which would
> be a shame, because the community has limited resources), as people
> answering on SO might not be aware of the ML thread.
>
> D.
>
> On Fri, Jan 7, 2022 at 3:02 PM Siddhesh Kalgaonkar <kalgaonkarsiddh...@gmail.com> wrote:
> I am trying to achieve exactly-once semantics using Flink and Kafka. I
> have explained my scenario thoroughly in this post:
>
> https://stackoverflow.com/questions/70622321/exactly-once-in-flink-kafka-producer-and-consumer
>
> Any help is much appreciated!
>
> Thanks,
> Sid
>
> --
>
> Martijn Visser | Product Manager
>
> mart...@ververica.com
>
> <https://www.ververica.com/>
>
> Follow us @VervericaData
>
> --
>
> Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference
>
> Stream Processing | Event Driven | Real Time