There is no way to avoid reprocessing tuples if the goal is to achieve exactly-once results. Please have a look at the presentation about fault tolerance and the blog post "End-to-end exactly-once", both available from http://apex.apache.org/docs.html.
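
To illustrate why reprocessing and exactly-once results are compatible, here is a minimal sketch in plain Java. It does not use the Apex API: ExternalStore, IdempotentSink and ReplayDemo are hypothetical names, and only the beginWindow/endWindow naming mirrors the window callbacks of an operator. It assumes that replay is deterministic (the same tuples arrive in the same windows) and that the external system can commit the data and the window id in one atomic step.

```java
/** Simulates a failure followed by a replay of already committed windows. */
class ReplayDemo {
    static void runWindow(IdempotentSink sink, long windowId, long... tuples) {
        sink.beginWindow(windowId);
        for (long t : tuples) {
            sink.process(t);
        }
        sink.endWindow();
    }

    public static void main(String[] args) {
        ExternalStore store = new ExternalStore();
        IdempotentSink sink = new IdempotentSink(store);
        runWindow(sink, 1L, 1L, 2L);
        runWindow(sink, 2L, 3L, 4L);

        // Simulated failure: a fresh operator instance is created and the
        // windows since the last checkpoint (here 1 and 2) are replayed.
        sink = new IdempotentSink(store);
        runWindow(sink, 1L, 1L, 2L);   // skipped, already committed
        runWindow(sink, 2L, 3L, 4L);   // skipped, already committed
        runWindow(sink, 3L, 5L);       // new window, applied once

        System.out.println(store.getTotal()); // prints 15
    }
}

/** Hypothetical external system; commit() models one atomic transaction. */
class ExternalStore {
    private long committedWindowId = -1; // last window whose results are durable
    private long total = 0;              // state mutated by the operator

    synchronized long getCommittedWindowId() { return committedWindowId; }

    synchronized long getTotal() { return total; }

    // Data and window id change together, so a crash cannot separate them.
    synchronized void commit(long windowId, long delta) {
        total += delta;
        committedWindowId = windowId;
    }
}

/** Hypothetical output operator that tolerates replayed windows. */
class IdempotentSink {
    private final ExternalStore store;
    private long currentWindowId;
    private long pending;
    private boolean skip;

    IdempotentSink(ExternalStore store) { this.store = store; }

    void beginWindow(long windowId) {
        currentWindowId = windowId;
        pending = 0;
        // A window at or below the committed id was already written before the failure.
        skip = windowId <= store.getCommittedWindowId();
    }

    void process(long tuple) {
        if (!skip) {
            pending += tuple;
        }
    }

    void endWindow() {
        if (!skip) {
            store.commit(currentWindowId, pending);
        }
    }
}
```

The tuples of windows 1 and 2 are processed twice, but the external total reflects each window only once, because the sink compares the replayed window id with the one recorded in the store.
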
The platform alone cannot guarantee exactly-once results. Operators that mutate state in external systems need to cooperate, and many of the frequently used connectors in the Apex library do that.

The operator processing mode "EXACTLY_ONCE" should be deprecated. It leads to unnecessary checkpointing overhead and cannot guarantee exactly-once results.

Thomas

On Sat, Jun 17, 2017 at 2:21 PM, Vivek Bhide <bhide.vi...@gmail.com> wrote:

> After any of the operators fails during processing, it always recovers from
> the last checkpointed state. So it will reprocess all the tuples which were
> processed before failure but not checkpointed. What is a recommended way to
> avoid this from happening? Is there any setting in Apex that enables the
> checkpoint creation just before the operator completely gets killed or
> fails? If not then how can it be achieved?
>
> I tried tweaking the operator processing mode to EXACTLY_ONCE. Also checked
> details about CountStoreOperator but to make sure I cover all the individual
> operator failures, I will have to put CountStoreOperator after every
> operator. Not sure if this is really a scalable solution.
>
> What is the best recommended way to achieve this?