First of all, thanks for the detailed explanation!

From my point of view (as a runner developer) this is definitely
confusing, especially discovering that an element in an empty window can be
dropped at any time. So +1 for Robert's comment on not having this in the
public API, and according to Kenneth's lookup it looks like it's not
entangled too deeply.

So I guess #valueInGlobalWindow should be the "go-to" default (as long as
no "real" windows are involved). Should we consider making this clearer in
the public API? Maybe WindowedValue<T>#defaultValue(T), which would probably
just place the value in the global window. Just a thought.
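
To make the suggestion a bit more concrete, here is a rough sketch of what I
have in mind. Everything in it is hypothetical: SketchWindowedValue, the
String stand-in for a window, and isDroppable are made-up names for
illustration and are not the SDK's WindowedValue; defaultValue is only the
name floated above.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for a windowed value. NOT the SDK class.
final class SketchWindowedValue<T> {
    // Assumed stand-in for the single global window.
    static final String GLOBAL_WINDOW = "GlobalWindow";

    private final T value;
    private final List<String> windows;

    private SketchWindowedValue(T value, List<String> windows) {
        this.value = value;
        this.windows = windows;
    }

    // Proposed public "go-to" factory: the value always lands in the global window.
    static <T> SketchWindowedValue<T> defaultValue(T value) {
        return new SketchWindowedValue<>(value, Collections.singletonList(GLOBAL_WINDOW));
    }

    // Would stay runner-internal (package-private here): the value is in no window.
    static <T> SketchWindowedValue<T> valueInEmptyWindows(T value) {
        return new SketchWindowedValue<>(value, Collections.<String>emptyList());
    }

    T getValue() {
        return value;
    }

    List<String> getWindows() {
        return windows;
    }

    // Per the thread: a value in no windows may be dropped by a runner at any time.
    boolean isDroppable() {
        return windows.isEmpty();
    }
}
```

With something like this, user-facing code would only ever see values that
are in at least one window, and the empty-windows case stays an
implementation detail of the runner.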

On Wed, Apr 13, 2016 at 7:29 PM Robert Bradshaw <[email protected]>
wrote:

> As Thomas says, the fact that we ever produce values in "no window" is
> an implementation quirk that should probably be fixed. (IIRC, it's
> used for the output of a GBK before we've done the
> group-also-by-windows to figure out what window it really should be
> in, so "value in unknown windows" would be a better choice).
>
> If a WindowFn doesn't assign a value to any windows, the system is
> free to drop it. There are pros and cons to supporting this degenerate
> case vs. making it an error. However, this should almost certainly not
> be in the public API...
>
> - Robert
>
>
> On Wed, Apr 13, 2016 at 9:06 AM, Thomas Groh <[email protected]>
> wrote:
> > Actually, my above claim isn't as strong as it can be.
> >
> > A value in no windows is considered to not exist. Values that are not
> > assigned to any window can be dropped by a runner at *any time*. A WindowFn
> > *must* assign all elements to at least one window. All elements that are
> > produced by any PTransform (including Sources) must be in a window,
> > potentially the GlobalWindow.
> >
> > On Wed, Apr 13, 2016 at 8:52 AM, Thomas Groh <[email protected]> wrote:
> >
> >> Values should almost always be part of at least one window. WindowFns
> >> should place all elements in at least one window, as values that are in no
> >> windows will be dropped when they reach a GroupByKey.
> >>
> >> Elements in no windows, for example those created by
> >> WindowedValue.valueInEmptyWindows(T) are generally an implementation
> >> detail of a transform; for example, in the InProcessPipelineRunner, the
> >> KV<K, Iterable<WindowedValue<V>>> elements output by a GroupByKeyOnly are in
> >> empty windows - but by the time the element reaches the boundary of the
> >> GroupByKey, the elements are reassigned to the appropriate window(s).
> >>
> >> On Tue, Apr 12, 2016 at 11:44 PM, Amit Sela <[email protected]> wrote:
> >>
> >>> My instinct tells me that if a value does not belong to a specific window
> >>> (in time) it's a part of a global window, but if so, what's the role of
> >>> the "empty window". When should an element be a "value in an empty window"?
> >>>
> >>
> >>
>
