Hi guys,

I have a Flink streaming job that reads from Kafka, computes some statistics
increments, and stores them in HBase (using normal puts).
I'm using a fold function here with a window of a few seconds.

My tests showed me that restoring state with window functions does not work
exactly as I expected.
I thought that if my window function emits an aggregated object to a sink,
and that object fails in the sink, the write to HBase would be replayed. So
even if it actually got written to HBase but Flink thought it didn't (for
instance, during a network problem), I could be sure of idempotent writes. I
wanted to enforce that by using the timestamp of the first event used in
that window for aggregation.
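To make the intended scheme concrete, here is a minimal sketch (plain Python, with a dict standing in for the HBase table; the names `row_key` and `put` are illustrative, not Flink or HBase API). The idea is that a row key derived from the first event's timestamp makes a replayed put overwrite the same row instead of double-counting:

```python
# Sketch of the intended idempotent-write scheme: the row key is derived
# from the timestamp of the first event in the window, so replaying the
# same write overwrites the same row instead of incrementing it again.
# `hbase` is a plain dict standing in for the real table.

def row_key(entity, first_event_ts):
    """Deterministic row key: same window contents -> same key."""
    return f"{entity}:{first_event_ts}"

def put(hbase, entity, first_event_ts, count):
    """An overwriting put is naturally idempotent under replay."""
    hbase[row_key(entity, first_event_ts)] = count

hbase = {}
put(hbase, "user-1", 1000, 5)  # first attempt (write reached HBase...)
put(hbase, "user-1", 1000, 5)  # ...but was replayed after a sink failure
assert hbase == {"user-1:1000": 5}  # still a single, correct row
```

This only stays idempotent as long as the replayed window produces the same aggregate for the same key, which is exactly the assumption that breaks below.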

Now correct me if I'm wrong, but it seems that in the case of a failure (even
if it's in the sink) the whole flow is replayed from the last checkpoint,
which means that my window function might emit the aggregated object in a
different form: for instance, containing not only the tuples that failed but
also other ones. That would break my idempotency here, and I might end up
with higher counters than I should have.
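The hazard described above can be shown with a tiny simulation (plain Python; `aggregate` is a stand-in for the window fold, not actual Flink code). On replay from the checkpoint, an extra tuple can land in the re-built window, so the emitted counter differs even though the first-event timestamp, and therefore the idempotency key, is the same:

```python
# Sketch of the replay hazard: after restoring from a checkpoint, the window
# is re-aggregated from the checkpointed offset, so it may now contain tuples
# that were not part of the original (failed) aggregate.

def aggregate(events):
    """Fold stand-in: count events per key."""
    counts = {}
    for key, _ts in events:
        counts[key] = counts.get(key, 0) + 1
    return counts

# Original run: the window closed over the first two events only.
first_attempt = aggregate([("user-1", 1000), ("user-1", 1001)])

# Replay from the last checkpoint: a third event now falls into the replayed
# window, so the emitted counter is different for the same first-event
# timestamp (1000), i.e. the same idempotency key.
replayed = aggregate([("user-1", 1000), ("user-1", 1001), ("user-1", 1002)])

assert first_attempt == {"user-1": 2}
assert replayed == {"user-1": 3}  # same key, higher count
```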

Do you have any suggestion on how to solve or work around such a problem in Flink?

Thanks,
Kamil.
