
Re: [SS] watermark, eventTime and "StreamExecution: Streaming query made progress"

2017-08-11 Thread Michael Armbrust

The point here is to tell you what watermark value was used when executing this batch. You don't know the new watermark until the batch is over, and we don't want to do two passes over the data. In general, the semantics of the watermark are designed to be conservative (i.e. just because data is
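
To make the one-batch lag concrete, here is a minimal sketch in plain Python, not Spark code. It assumes the watermark is the maximum event time seen so far minus a delay threshold and that it never moves backwards; `run_batches`, `Progress`, and the integer timestamps are hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Progress:
    """Hypothetical stand-in for one per-batch progress report."""
    batch_id: int
    watermark_used: int            # watermark in effect while the batch ran
    max_event_time: Optional[int]  # max event time observed in the batch


def run_batches(batches: List[List[int]], delay: int) -> List[Progress]:
    """Simulate a watermark that is only advanced after each batch ends."""
    watermark = 0  # assumed initial watermark
    reports = []
    for batch_id, event_times in enumerate(batches):
        # The batch runs (and is reported) with the watermark that was
        # fixed before it started; event times are scanned only once.
        max_seen = max(event_times) if event_times else None
        reports.append(Progress(batch_id, watermark, max_seen))
        # Only now is the new watermark known; it applies to the next batch.
        if max_seen is not None:
            watermark = max(watermark, max_seen - delay)
    return reports


if __name__ == "__main__":
    for report in run_batches([[10, 20], [5, 30], []], delay=5):
        print(report)
```

In this sketch the event time 20 seen in batch 0 only affects the watermark reported for batch 1, which matches the behaviour described above: the report shows the value the batch was executed with, not the value derived from it.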

[SS] watermark, eventTime and "StreamExecution: Streaming query made progress"

2017-08-11 Thread Jacek Laskowski

Hi, I'm curious why the watermark is only updated in the streaming batch after the one in which it was observed [1]. The report (from ProgressReporter/StreamExecution) does not look right to me, as avg/max/min are already calculated according to the watermark [2]. My recommendation would be to do the update [2] in the