Today we support three different processing modes for operators, "at least once", "at most once" and "exactly once" which determine tuple processing and recovery behavior when there is operator recovery from failure. The default being at least once where the tuples are replayed from the recovered checkpoint.
At least once works well for most applications. Typically applications persist the final output of processing through the DAG into various outputs like key value stores, databases or even HDFS files. In many of these cases various strategies can be employed to save the data "exactly once" in the output, such as transactions, rewinding, meta data storage, idempotent operations etc. Furthermore the exactly once processing mode, which is a checkpoint performed every window is rarely used. All this leads to confusion especially to somebody new and also makes it difficult to explain these names to less technical audience in meetups and public forums. What I am proposing is only a name change which will make this more intuitive to understand. Something simple like "repeat" for "at least once", "latest" for "at most once" and "repeat latest" for "exactly once" can do the trick. Thanks
