On Wed, Dec 6, 2017 at 10:44 PM, Robert Bradshaw <[email protected]>
wrote:
>
> So, to answer the original question, I don't think windowing is
> operational, as given a particular windowing there is one and only one
> right output for a given input.


I just want to highlight that this is an operational/semantic blend. The
watermark determines what is in or is not in a PCollection, and the
watermark is fundamentally operational, since it is a relation between
event time and processing time. I am in complete agreement with your
statement _for a particular watermark curve_.

Or, more aggressively, if we say that dropped elements never existed - by
definition, and according to the watermark curve - then every window is
bounded (or the pipeline is stuck) and there is one and only one right
output and it will match a batch job over the same.

This touches on what I mean by the simplicity of dropping data based on
(watermark - timestamp) instead of (watermark - window expiry). When
dropping is based on window expiry, the relationship between real-time
results and archival results is not as clear.

Kenn



> This doesn't mean that it couldn't be
> interesting to explore expressing the desired windowing on the output
> of a transformation and letting it flow up rather than expressing it
> on the input and letting it flow down. (There are technical
> implications here--an output consumed in multiple ways may need to be
> executed multiple times despite it occurring once in the graph which
> could be counter-intuitive, though even just changing the triggering
> could require this, and runners could probably often be intelligent
> about sharing work between similar-enough windowing. It also means the
> choice of windowing couldn't be inspected during construction (though
> again we may have crossed that bridge if we can't inspect the choice
> of triggering).)
>
> Triggering, on the other hand, is very operational, and intention more
> easily defined at a sink (and propagated upwards) rather than the way
> we do now, so a huge +1 to this proposal.
>
> - Robert
>

Reply via email to