Re: Watermark on keyed stream

2018-10-10 Thread Elias Levy
You are correct that watermarks are not tracked per key.  You are dealing
with events with a high degree of delay variability.  That is usually not a
good match for event time processing as implemented in Flink.

You could use event time processing and configure a very large window
allowed lateness (days in your case), but that would significantly increase
the amount of state you must track.  That may be acceptable depending on
your message volume, scale of deployment, and state and timer storage
backend (RocksDB).


On Wed, Oct 10, 2018 at 11:12 AM Nick Triller 
wrote:

> Hi everyone,
>
>
>
> it seems Flink only supports global watermarks currently which is a
> problem for my use case.
>
> Many sensors send data which might be buffered for days in upstream
> systems before arriving at the Flink job.
>
> The job keys the stream by sensor. If other sensors send values in the
> meantime, the global watermark is advanced
>
> and buffered data that arrives late is dropped.
>
>
>
> How could the issue be solved? I guess it would be possible to calculate
> the watermark manually and add it to a wrapper object,
>
> but I am not sure how to correctly implement windowing (tumbling window)
> then.
>
>
>
> Thank you in advance for any ideas.
>
>
>
> Regards,
>
> Nick
>


Watermark on keyed stream

2018-10-10 Thread Nick Triller
Hi everyone,

it seems Flink only supports global watermarks currently which is a problem for 
my use case.
Many sensors send data which might be buffered for days in upstream systems 
before arriving at the Flink job.
The job keys the stream by sensor. If other sensors send values in the 
meantime, the global watermark is advanced
and buffered data that arrives late is dropped.

How could the issue be solved? I guess it would be possible to calculate the 
watermark manually and add it to a wrapper object,
but I am not sure how to correctly implement windowing (tumbling window) then.

Thank you in advance for any ideas.

Regards,
Nick