Hi Henry,

Yes, achieving exactly-once through atomic commits is heavy for MySQL.
However, you don't have to buffer data if you choose option 2. If the
result data is idempotent, you can simply overwrite old records with new
ones, and this also achieves exactly-once.
There is a document about End-to-End Exactly-Once Processing in Apache
Flink[1], which may be helpful for you.
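
To make the overwrite approach concrete, here is a minimal sketch in
Python. It is only an illustration under stated assumptions, not Flink
code: sqlite3 stands in for MySQL, and the table word_count and the
functions write_batch and demo are hypothetical names. In MySQL the same
idea would typically use REPLACE INTO or INSERT ... ON DUPLICATE KEY
UPDATE on a table with a primary key.

```python
import sqlite3


def write_batch(conn, records):
    """Write (word, cnt) pairs keyed by word, overwriting existing rows.

    Because each write replaces the row with the same primary key,
    replaying a batch after a failure leaves the table unchanged.
    """
    conn.executemany(
        "INSERT OR REPLACE INTO word_count (word, cnt) VALUES (?, ?)",
        records,
    )
    conn.commit()


def demo():
    # In-memory database as a stand-in for the real sink (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE word_count (word TEXT PRIMARY KEY, cnt INTEGER)"
    )

    batch = [("flink", 3), ("kafka", 5)]
    write_batch(conn, batch)
    # Simulate a replay after recovery: the same records arrive again.
    write_batch(conn, batch)
    # A later, updated result for the same key overwrites the old one.
    write_batch(conn, [("flink", 4)])

    rows = conn.execute(
        "SELECT word, cnt FROM word_count ORDER BY word"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

The replayed batch creates no duplicate rows, so the sink needs neither
buffering nor a rollback step. This only works when every record carries
a deterministic key and the latest value for a key is the correct one.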

Best, Hequn

[1] https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html



On Fri, Oct 12, 2018 at 5:21 PM 徐涛 <happydexu...@gmail.com> wrote:

> Hi
>         I am reading the book “Introduction to Apache Flink”, which
> mentions two ways to achieve exactly-once at the sink:
>         1. The first way is to buffer all output at the sink and commit
> this atomically when the sink receives a checkpoint record.
>         2. The second way is to eagerly write data to the output, keeping
> in mind that some of this data might be “dirty” and replayed after a
> failure. If there is a failure, then we need to roll back the output, thus
> overwriting the dirty data and effectively deleting dirty data that has
> already been written to the output.
>
>         I read the code of the Elasticsearch sink and found a
> flushOnCheckpoint option. If it is set to true, changes accumulate until
> a checkpoint is made. I guess this guarantees at-least-once delivery:
> although it flushes in batches, the flush is not an atomic action, so it
> cannot guarantee exactly-once delivery.
>
>         My questions are:
>         1. Many sinks do not support transactions, so in that case I have
> to choose option 2 to achieve exactly-once. I would then have to buffer
> all the records between checkpoints and delete them, which is a rather
> heavy operation.
>         2. I guess a MySQL sink should support exactly-once delivery,
> because it supports transactions, but in that case I have to size each
> batch by the number of records between checkpoints rather than by a fixed
> number, 100 for example. When there are many records between checkpoints,
> that is a heavy operation too.
>
> Best
> Henry
