Re: Scope of windows?

2019-04-30 Thread Robert Bradshaw
In the original version of the dataflow model, windowing was not annotated on each PCollection, rather it was inferred based on tracing up the graph to the latest WindowInto operation. This tracing logic was put in the SDK for simplicity. I agree that there is room for a variety of SDK/DSL

Re: Scope of windows?

2019-04-30 Thread Kenneth Knowles
+dev@ since this has taken a turn in that direction. SDK/DSL consistency is nice. But each SDK/DSL being the best thing it can be is more important IMO. I'm including DSLs to be clear that this is a construction issue having little/nothing to do with SDK in the sense of the per-run-time

Re: Scope of windows?

2019-04-30 Thread Maximilian Michels
While it might be debatable whether "continuation triggers" are part of the model, the goal should be to provide a consistent experience across SDKs. I don't see a reason why the Java SDK would use continuation triggers while the Python SDK doesn't. This makes me think that trigger behavior

Re: Scope of windows?

2019-04-29 Thread Robert Bradshaw
I would say that the triggering done in stacked GBKs, with windowings in between, is part of the model (at least in the sense that it's not something that we'd want different SDKs to do separately.) OTOH, I'm not sure the continuation trigger should be part of the model. Much easier to either let

Re: Scope of windows?

2019-04-28 Thread Kenneth Knowles
It is accurate to say that the "continuation trigger" is not documented in the general programming guide. It shows up in the javadoc only, as far as I can tell [1]. Technically, this is accurate. It is not part of the core of Beam - each language SDK is required to explicitly specify a trigger for

Re: Scope of windows?

2019-04-28 Thread Reza Rokni
+1 I recall a fun afternoon a few years ago figuring this out ... On Mon, 11 Mar 2019 at 18:36, Maximilian Michels wrote: > Hi, > > I have seen several users including myself get confused by the "default" > triggering behavior. I think it would be worthwhile to update the docs. > > In fact,

Re: Scope of windows?

2019-03-11 Thread Maximilian Michels
Hi, I have seen several users including myself get confused by the "default" triggering behavior. I think it would be worthwhile to update the docs. In fact, Window.into(windowFn) does not override the existing windowing/triggering. It merges the previous input WindowStrategy with the new
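
A minimal sketch of the merging behavior described in the message above, written as a plain-Python toy model rather than Beam code. The names `ToyWindowingStrategy` and `toy_window_into` and the string labels are hypothetical and only illustrate the idea that settings left unspecified in a later windowing step are inherited from the input. It does not model how the SDK derives continuation triggers after a grouping.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ToyWindowingStrategy:
    """Hypothetical stand-in for a windowing strategy (not a Beam class)."""
    window_fn: str = "GlobalWindows"
    trigger: str = "DefaultTrigger"


def toy_window_into(current: ToyWindowingStrategy,
                    window_fn: Optional[str] = None,
                    trigger: Optional[str] = None) -> ToyWindowingStrategy:
    """Return a new strategy, overriding only the fields that were given.

    Fields left as None are inherited from the input strategy, which is the
    behavior the message above attributes to Window.into.
    """
    changes = {}
    if window_fn is not None:
        changes["window_fn"] = window_fn
    if trigger is not None:
        changes["trigger"] = trigger
    return replace(current, **changes)


# First windowing step sets both a window function and a trigger.
upstream = toy_window_into(ToyWindowingStrategy(),
                           window_fn="FixedWindows(60s)",
                           trigger="w1trigger")

# Second step only sets a window function; the trigger is not reset.
downstream = toy_window_into(upstream, window_fn="FixedWindows(300s)")
print(downstream)
```

In this toy model the second step changes the window function but keeps the trigger set by the first step, which is the source of the confusion discussed in the thread.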

Re: Scope of windows?

2019-03-05 Thread Daniel Debrunner
Thanks Kenn. Is it fair to say that this continuation trigger functionality is not documented? The Javadoc has a similar sentence to the programming guide: > triggering(Trigger) allows specifying a trigger to control when (in > processing time) results for the given window can be

Re: Scope of windows?

2019-03-05 Thread Kenneth Knowles
The Window.into transform does not reset the trigger to the default. So where you have w2trigger, if you leave it off, then the triggering is left as the "continuation trigger" from w1trigger. Basically it tries to let any output caused by w1trigger flow all the way through the pipeline without

Re: Scope of windows?

2019-03-05 Thread Daniel Debrunner
I discovered how to fix my issue, but I'm not sure I understand why the fix works. I created a complete sample here: https://gist.github.com/ddebrunner/5d4ef21c255c1d40a4517a0060ff8b99#file-cascadewindows-java-L104 The link points to the area of interest. With the second window I was originally not specifying a

Re: Scope of windows?

2019-03-05 Thread Daniel Debrunner
Thanks Robert, your description is what I'm expecting, I'm working on a simple example to see if what I'm seeing is different and then hopefully use that to clarify my misunderstanding. Thanks, Dan. On Tue, Mar 5, 2019 at 11:31 AM Robert Bradshaw wrote: > > Windows are assigned to elements via

Re: Scope of windows?

2019-03-05 Thread Robert Bradshaw
Windows are assigned to elements via the Window.into transform. They influence grouping operations such as GroupByKey, Combine.perKey, and Combine.globally. Looking at your example, you start with PCollection> Presumably via a Read or a Create. These KVs are in a global window, so the

Re: Scope of windows?

2019-03-05 Thread Daniel Debrunner
Thanks for the reply. As for every element always being associated with a window: when an element is produced due to a window trigger (e.g. the GroupByKey), what window is it associated with? The window it was produced from? Maybe the question is when is a window assigned to an element? I'll see if I

Re: Scope of windows?

2019-03-05 Thread Kenneth Knowles
Two pieces to this: 1. Every element in a PCollection is always associated with a window, and GroupByKey (hence CombinePerKey) operates per-key-and-window (w/ window merging). 2. If an element is not explicitly a KV, then there is no key associated with it. I'm afraid I don't have any guesses at
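
Point 1 in the message above (grouping happens per key and per window) can be illustrated with a small plain-Python sketch. This is a toy model, not Beam code: `fixed_window_start` and `group_by_key_and_window` are hypothetical helpers, the fixed-size windows and integer timestamps are assumptions, and window merging is not modeled.

```python
from collections import defaultdict


def fixed_window_start(timestamp: int, size: int) -> int:
    """Start of the fixed-size window that contains the timestamp."""
    return timestamp - (timestamp % size)


def group_by_key_and_window(elements, size: int):
    """Group values by (key, window start) instead of by key alone.

    `elements` is an iterable of (key, value, timestamp) tuples.
    """
    groups = defaultdict(list)
    for key, value, timestamp in elements:
        groups[(key, fixed_window_start(timestamp, size))].append(value)
    return dict(groups)


# Three elements share key "a", but one of them falls into a later window.
elements = [("a", 1, 5), ("a", 2, 12), ("b", 3, 7), ("a", 4, 8)]
grouped = group_by_key_and_window(elements, size=10)
print(grouped)
```

Elements with the same key end up in separate groups when they fall into different windows, so a downstream grouping only sees the values that share both key and window.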

Scope of windows?

2019-03-05 Thread Daniel Debrunner
The windowing section of the Beam programming model guide shows a window defined and used in the GroupByKey transform after a ParDo (section 7.1.1). However, I couldn't find any documentation on how long the window remains in scope for subsequent transforms. I have an application with this

Scope of windows?

2019-03-04 Thread Daniel Debrunner
The windowing section of the Beam programming model guide shows a window defined and used in the GroupByKey transform after a ParDo (section 7.1.1). However, I couldn't find any documentation on how long the window remains in scope for subsequent transforms. I have an application with this