[
https://issues.apache.org/jira/browse/FLINK-6472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002961#comment-16002961
]
Elias Levy commented on FLINK-6472:
-----------------------------------
Ideally, Flink would provide an abstract class similar to
{{BoundedOutOfOrdernessTimestampExtractor}} that bounds the watermark in
event time rather than processing time, thus ensuring even watermark output in
event time. It would also take a parameter that bounds, in processing time,
how long it waits before generating a watermark, so that a watermark is still
output when a lull in messages leaves nothing to trigger the event-time-driven
watermark.
Alternatively, if the extractor could implement both
{{AssignerWithPeriodicWatermarks}} and {{AssignerWithPunctuatedWatermarks}}
then the user could implement that logic themselves.
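To make the proposal concrete, here is a rough sketch of the decision logic only. It is not a Flink API and does not implement either assigner interface. The class name, method names, and all three parameters are hypothetical. It assumes the caller invokes {{onEvent}} for every element (the punctuated path) and {{onProcessingTimeTick}} from a processing time timer (the periodic path).

```java
import java.util.OptionalLong;

// Hypothetical sketch, not part of Flink: decides when a watermark should be emitted.
final class EventTimeBoundedWatermarker {
    private final long maxOutOfOrdernessMs; // watermark lags the max timestamp by this much
    private final long eventTimeIntervalMs; // emit once the watermark can advance this far in event time
    private final long idleTimeoutMs;       // emit after this much processing time without a watermark

    private boolean sawEvent = false;
    private boolean emittedWatermark = false;
    private long maxTimestamp;
    private long lastWatermark;
    private long lastEmitProcessingTime;

    EventTimeBoundedWatermarker(long maxOutOfOrdernessMs, long eventTimeIntervalMs,
                                long idleTimeoutMs, long startProcessingTime) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        this.eventTimeIntervalMs = eventTimeIntervalMs;
        this.idleTimeoutMs = idleTimeoutMs;
        this.lastEmitProcessingTime = startProcessingTime;
    }

    // Per-element path: bounds watermark spacing in event time.
    OptionalLong onEvent(long eventTimestamp, long nowProcessingTime) {
        maxTimestamp = sawEvent ? Math.max(maxTimestamp, eventTimestamp) : eventTimestamp;
        sawEvent = true;
        long candidate = maxTimestamp - maxOutOfOrdernessMs;
        if (!emittedWatermark || candidate - lastWatermark >= eventTimeIntervalMs) {
            return emit(candidate, nowProcessingTime);
        }
        return OptionalLong.empty();
    }

    // Timer path: bounds the delay in processing time when events stop arriving.
    OptionalLong onProcessingTimeTick(long nowProcessingTime) {
        if (!sawEvent) {
            return OptionalLong.empty();
        }
        long candidate = maxTimestamp - maxOutOfOrdernessMs;
        boolean advances = !emittedWatermark || candidate > lastWatermark;
        boolean idleTooLong = nowProcessingTime - lastEmitProcessingTime >= idleTimeoutMs;
        return (advances && idleTooLong) ? emit(candidate, nowProcessingTime) : OptionalLong.empty();
    }

    private OptionalLong emit(long watermark, long nowProcessingTime) {
        lastWatermark = watermark;
        emittedWatermark = true;
        lastEmitProcessingTime = nowProcessingTime;
        return OptionalLong.of(watermark);
    }
}
```

With this shape, a burst of events advances the watermark every {{eventTimeIntervalMs}} of event time no matter how quickly the events arrive, while the timer path only fires when a pending watermark advance has been held back for {{idleTimeoutMs}} of processing time.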
> BoundedOutOfOrdernessTimestampExtractor does not bound out of orderliness
> -------------------------------------------------------------------------
>
> Key: FLINK-6472
> URL: https://issues.apache.org/jira/browse/FLINK-6472
> Project: Flink
> Issue Type: Bug
> Components: DataStream API
> Affects Versions: 1.3.0
> Reporter: Elias Levy
>
> {{BoundedOutOfOrdernessTimestampExtractor}} attempts to emit watermarks that
> lag behind the largest observed timestamp by a configurable time delta. It
> fails to do so in some circumstances.
> The class extends {{AssignerWithPeriodicWatermarks}}, which generates
> watermarks at periodic intervals. The timer for these intervals is a
> processing time timer.
> In circumstances where there is a rush of events (restarting Flink, unpausing
> an upstream producer, loading events from a file, etc), many events with
> timestamps much larger than what the configured bound would normally allow
> will be sent downstream without a watermark. This can have negative effects
> downstream, as operators may be buffering the events waiting for a watermark
> to process them, thus leading to memory growth and possible out-of-memory
> conditions.
> It is probably best to have a bounded out of orderliness extractor that is
> based on the punctuated timestamp extractor, so we can ensure that watermarks
> are generated in a timely fashion in event time, with the addition of a
> processing time timer to generate a watermark if there has been a lull in events, thus
> also bounding the delay of generating a watermark in processing time.