I see. But yes, even in the case the watermark will always be "one behind". The
logic in the extraction operator is roughly this:
1. Extract timestamp T, assign to internal StreamRecord
2. Send StreamRecord downstream
3. Extract Watermark W
4. Send Watermark downstream
(In your case T == W)
The reason is that a watermark T says that there will not be an element with a
timestamp <= T in the future. If the watermark were sent before the record then
this would violate the watermark contract, i.e. your element with timestamp T
would arrive after the watermark W. I think it's not easily possible to have a
properly defined watermark for the very first element in a stream,
> On 4. Aug 2017, at 16:43, Gwenhael Pasquiers
> <gwenhael.pasqui...@ericsson.com> wrote:
> We're using a AssignerWithPunctuatedWatermarks that extracts a timestamp from
> the data. It keeps and returns the higher timestamp it has ever seen and
> returns a new Watermark everytime the value grows.
> I know it's bad for performances, but for the moment it's not the issue, i
> want the most possibly up-to-date watermark.
> -----Original Message-----
> From: Aljoscha Krettek [mailto:aljos...@apache.org]
> Sent: vendredi 4 août 2017 12:22
> To: Gwenhael Pasquiers <gwenhael.pasqui...@ericsson.com>
> Cc: Nico Kruber <n...@data-artisans.com>; firstname.lastname@example.org
> Subject: Re: Event-time and first watermark
> How are you defining the watermark, i.e. what kind of watermark extractor are
> you using?
>> On 3. Aug 2017, at 17:45, Gwenhael Pasquiers
>> <gwenhael.pasqui...@ericsson.com> wrote:
>> We're not using a Window but a more basic ProcessFunction to handle
>> sessions. We made this choice because we have to handle (millions of)
>> sessions that can last from 10 seconds to 24 hours so we wanted to handle
>> things manually using the State class.
>> We're using the watermark as an event-time "clock" to:
>> * compute "lateness" of a message relatively to the watermark (most
>> recent message from the stream)
>> * fire timer events
>> We're using event-time instead of processing time because our stream will be
>> late and data arrive by hourly bursts.
>> Maybe we're misusing the watermark ?
>> -----Original Message-----
>> From: Nico Kruber [mailto:n...@data-artisans.com]
>> Sent: jeudi 3 août 2017 16:30
>> To: email@example.com
>> Cc: Gwenhael Pasquiers <gwenhael.pasqui...@ericsson.com>
>> Subject: Re: Event-time and first watermark
>> Hi Gwenhael,
>> "A Watermark(t) declares that event time has reached time t in that
>> stream, meaning that there should be no more elements from the stream
>> with a timestamp t’ <= t (i.e. events with timestamps older or equal
>> to the watermark)." 
>> Therefore, they should be behind the actual event with timestamp t.
>> What is it that you want to achieve in the end? What do you want to use the
>> watermark for? They are basically a means to defining when an event time
>> window ends.
>>  https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/
>> On Thursday, 3 August 2017 10:24:35 CEST Gwenhael Pasquiers wrote:
>>> From my tests it seems that the initial watermark value is
>>> Long.MIN_VALUE even though my first data passed through the timestamp
>>> extractor before arriving into my ProcessFunction. It looks like the
>>> watermark "lags" behind the data by one message.
>>> Is there a way to have a watermark more "up to date" ? Or is the only
>>> way to compute it myself into my ProcessFunction ?