Thanks for the reply. As for every element always being associated with a window: when an element is produced by a window trigger (e.g. by the GroupByKey), what window is it associated with? The window it was produced from? Maybe the real question is: when is a window assigned to an element?
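
To make the question concrete, here is a rough model of what I would expect, written as plain Java rather than real Beam code, and not a reproduction of my problem. It assumes the answer is "the window it was produced from": the grouped output stays in the fixed window its inputs fell into, and the second stage uses that same window to combine across all keys. Element, windowStart, groupByKeyAndWindow and sumPerWindow are made-up names for illustration, not Beam APIs, and summing stands in for the real combine function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the expected semantics; none of this is Beam API.
class WindowSketch {
    // Hypothetical stand-in for a keyed, timestamped element.
    static final class Element {
        final String key;
        final int value;
        final long timestampMillis;

        Element(String key, int value, long timestampMillis) {
            this.key = key;
            this.value = value;
            this.timestampMillis = timestampMillis;
        }
    }

    // Start of the fixed window that contains the given timestamp.
    static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
    }

    // Models GroupByKey: one group per (window, key). Assumption: each group
    // stays associated with the window its inputs came from.
    static Map<Long, Map<String, List<Integer>>> groupByKeyAndWindow(
            List<Element> input, long sizeMillis) {
        Map<Long, Map<String, List<Integer>>> grouped = new TreeMap<>();
        for (Element e : input) {
            grouped.computeIfAbsent(windowStart(e.timestampMillis, sizeMillis), w -> new TreeMap<>())
                   .computeIfAbsent(e.key, k -> new ArrayList<>())
                   .add(e.value);
        }
        return grouped;
    }

    // Models a global combine (here a sum): one result per window, across all keys.
    static Map<Long, Integer> sumPerWindow(Map<Long, Map<String, List<Integer>>> grouped) {
        Map<Long, Integer> totals = new TreeMap<>();
        for (Map.Entry<Long, Map<String, List<Integer>>> window : grouped.entrySet()) {
            int total = 0;
            for (List<Integer> values : window.getValue().values()) {
                for (int v : values) {
                    total += v;
                }
            }
            totals.put(window.getKey(), total);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Element> input = List.of(
                new Element("a", 1, 1_000),
                new Element("b", 2, 2_000),
                new Element("a", 3, 61_000));
        Map<Long, Map<String, List<Integer>>> grouped = groupByKeyAndWindow(input, 60_000);
        System.out.println(grouped);               // groups per window and key
        System.out.println(sumPerWindow(grouped)); // one total per window
    }
}
```

In this model the second stage yields a single value per window no matter how many keys the first stage saw, which is the behaviour I was expecting from the real pipeline.
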
I'll see if I can come up with an example.

Thanks,
Dan.

On Tue, Mar 5, 2019 at 10:47 AM Kenneth Knowles <[email protected]> wrote:
>
> Two pieces to this:
>
> 1. Every element in a PCollection is always associated with a window, and
> GroupByKey (hence CombinePerKey) operates per-key-and-window (w/ window
> merging).
> 2. If an element is not explicitly a KV, then there is no key associated with
> it.
>
> I'm afraid I don't have any guesses at the problem based on what you've
> shared. Can you say more?
>
> Kenn
>
> On Tue, Mar 5, 2019 at 10:29 AM Daniel Debrunner <[email protected]> wrote:
>>
>> The windowing section of the Beam programming model guide shows a
>> window defined and used in the GroupByKey transform after a ParDo
>> (section 7.1.1).
>>
>> However, I couldn't find any documentation on how long the window
>> remains in scope for subsequent transforms.
>>
>> I have an application with this pipeline:
>>
>> PCollection<KV<A,B>> -> FixedWindow<KV<A,B>> -> GroupByKey ->
>> PCollection<X> -> FixedWindow<X> -> Combine<X,R>.globally ->
>> PCollection<R>
>>
>> The idea is that the first window is aggregating by key, but in the
>> second window I need to combine elements across all keys.
>>
>> With my initial app I was seeing some runtime errors in/after the
>> combine where a KV<null,R> was being seen, even though at that point
>> there should be no key for the PCollection<R>.
>>
>> In a simpler test I can apply FixedWindow<X> -> Combine<X,R>.globally
>> -> PCollection<R> to a PCollection without an upstream window and the
>> combine correctly happens once.
>> But then, adding the keyed upstream window, the combine occurs once per
>> key without any final combine across the keys.
>>
>> So it seems somehow the memory of the key exists even with the new
>> window transform.
>>
>> I'm probably misunderstanding some detail of windowing, but I couldn't
>> find any deeper documentation than the simple examples in the
>> programming model guide.
>>
>> Can anyone point me in the correct direction?
>>
>> Thanks,
>> Dan.
