Hi all,

I've been using Apache Flink for the last few months but I'm new to
the dev community. I'd like to contribute code (and possibly more) to
the community and I've been advised a good starting point would be
suggesting improvements for those areas that I found lacking. I'll
create a separate [DISCUSS] thread for each of those (if this is
indeed the process!).

-- Problem statement --

In my use cases I've had to output data at regular (event time)
intervals, regardless of whether there's been any events flowing
through the app. For those occasions when no events flow I've been
happy to delay the emission of data for some time. This amount of time
is reasonable and still several times larger than the worse delays of
my event bus. It also meets the business requirements :)

Flink's documentation suggests marking a source as temporarily idle
for such occasions but to the best of my knowledge it will not advance
the watermark if there's no events at all flowing through the system.


-- Proposed solution --

Provide all implementations of AssignerWithPeriodicWatermarks in the
Flink project a mechanism to specify a max time delay after which the
watermark will advance if no events have been processed. The watermark
will always stay as far as the specified delay when advanced in this
way.

To achieve backward compatibility I suggest providing the
implementations of AssignerWithPeriodicWatermarks with a builder
method that'll allow to specify said max delay. Other options to
introducing this change in a non-invasive way are welcome.

I'm hoping for your suggestions/comments/questions.

Thanks,
Eduardo

Reply via email to