We are receiving events from a number of independent data sources, so the
data arriving in our Flink topology (via Kafka) is out of order.
We create 1-minute event-time windows in our Flink topology and generate
event-time watermarks at the source operator as (maximum event time seen
so far - threshold (30 seconds)).
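To make the setup concrete, here is a minimal stand-alone sketch of the watermark arithmetic described above (the watermark trails the highest event timestamp seen so far by a fixed 30-second threshold). Flink ships this behavior as BoundedOutOfOrdernessTimestampExtractor; the class below is only an illustration of the logic, not our actual operator code:

```java
// Sketch: watermark = max event timestamp seen so far - threshold.
// Events at or below the watermark are considered late and dropped.
public class BoundedWatermark {
    private final long thresholdMillis;
    private long maxSeenTimestamp = Long.MIN_VALUE;

    public BoundedWatermark(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Called for every incoming event; tracks the maximum timestamp.
    public void onEvent(long eventTimestampMillis) {
        maxSeenTimestamp = Math.max(maxSeenTimestamp, eventTimestampMillis);
    }

    // Current watermark: max event time minus the lateness threshold.
    public long currentWatermark() {
        return maxSeenTimestamp - thresholdMillis;
    }

    // An event arriving at or below the current watermark is late.
    public boolean isLate(long eventTimestampMillis) {
        return eventTimestampMillis <= currentWatermark();
    }
}
```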
If a few events arrive after the threshold, they are simply dropped, which
is acceptable in our case because most of the events belonging to that
minute will already have arrived and been processed in the corresponding
window.
Now, the problem: if the program crashes (for whatever reason) and is then
resumed from the last successful checkpoint, out-of-order events arriving
after recovery trigger the execution of past (already processed) windows,
each holding only a minuscule number of events, overriding the results of
the previous computation of those windows.
Had Flink checkpointed the event-time watermarks, this problem would not
occur.
So, I am wondering if there is a way to enforce checkpointing of event-time
watermarks in Flink...
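One possible workaround (an assumption on my side, not something Flink does out of the box): persist the last emitted watermark explicitly as operator state (e.g. via Flink's CheckpointedFunction), restore it after recovery, and discard any event at or below it so a past window can never re-fire. Below is a minimal sketch of just the filtering logic, with the Flink state wiring omitted:

```java
// Sketch: keep the last watermark as restorable state and use it after
// recovery to drop events that belong to already-processed windows.
// In a real job, snapshot()/restore would go through CheckpointedFunction.
public class WatermarkFilter {
    private long watermark;

    // On restore, initialize from the checkpointed watermark instead of
    // starting from Long.MIN_VALUE.
    public WatermarkFilter(long restoredWatermark) {
        this.watermark = restoredWatermark;
    }

    // Advance the watermark monotonically as new watermarks are emitted.
    public void advance(long newWatermark) {
        watermark = Math.max(watermark, newWatermark);
    }

    // Accept only events strictly newer than the watermark, so a late
    // event cannot re-open (and override) a past window after recovery.
    public boolean accept(long eventTimestampMillis) {
        return eventTimestampMillis > watermark;
    }

    // Value to store at the next checkpoint.
    public long snapshot() {
        return watermark;
    }
}
```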
Software Development Engineer
On Tue, Feb 27, 2018 at 6:54 PM, Xingcan Cui <xingc...@gmail.com> wrote:
> Hi Vijay,
> normally, there may be no need to checkpoint the event times / watermarks
> since they are automatically generated based on the records. What’s your
> > On 27 Feb 2018, at 8:50 PM, vijay kansal <vijaykans...@gmail.com> wrote:
> > Hi All
> > Is there a way to checkpoint event time watermarks in Flink ?
> > I tried searching for this, but could not figure it out...
> > Vijay Kansal
> > Software Development Engineer
> > LimeRoad