Thanks for the reply.

Regarding "every element is always associated with a window": when an
element is produced by a window trigger (e.g. from the GroupByKey), which
window is it associated with? The window it was produced from? Maybe
the real question is: when is a window assigned to an element?
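
To make the question concrete, here is a minimal plain-Java sketch of the
model as I understand it from the quoted reply. It does not use the Beam
API: WindowModel, Element, windowStart, groupByKeyAndWindow and
combineGloballyPerWindow are hypothetical names made up for illustration.
It assumes that a window is derived from the element's timestamp, that
grouping happens per key and per window, and that a grouped result stays
in the window it was produced from. Window merging and triggers are left
out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class WindowModel {

    /** Hypothetical stand-in for a timestamped key/value element. */
    static final class Element {
        final String key;
        final int value;
        final long timestampMillis;

        Element(String key, int value, long timestampMillis) {
            this.key = key;
            this.value = value;
            this.timestampMillis = timestampMillis;
        }
    }

    /** Start of the fixed window that contains the given timestamp. */
    static long windowStart(long timestampMillis, long sizeMillis) {
        return Math.floorDiv(timestampMillis, sizeMillis) * sizeMillis;
    }

    /** Step 1: group per key AND per window (window start -> key -> values). */
    static Map<Long, Map<String, List<Integer>>> groupByKeyAndWindow(
            List<Element> input, long sizeMillis) {
        Map<Long, Map<String, List<Integer>>> grouped = new TreeMap<>();
        for (Element e : input) {
            grouped
                .computeIfAbsent(windowStart(e.timestampMillis, sizeMillis), w -> new TreeMap<>())
                .computeIfAbsent(e.key, k -> new ArrayList<>())
                .add(e.value);
        }
        return grouped;
    }

    /** Step 2: sum across all keys, producing one result per window. */
    static Map<Long, Integer> combineGloballyPerWindow(
            Map<Long, Map<String, List<Integer>>> grouped) {
        Map<Long, Integer> totals = new TreeMap<>();
        grouped.forEach((window, byKey) -> {
            int sum = 0;
            for (List<Integer> values : byKey.values()) {
                for (int v : values) {
                    sum += v;
                }
            }
            totals.put(window, sum);
        });
        return totals;
    }

    public static void main(String[] args) {
        // Three elements fall into the window starting at 0, one into the window starting at 1000.
        List<Element> input = List.of(
            new Element("a", 1, 0L),
            new Element("b", 2, 500L),
            new Element("a", 3, 999L),
            new Element("b", 10, 1000L));
        Map<Long, Map<String, List<Integer>>> grouped = groupByKeyAndWindow(input, 1000L);
        System.out.println(grouped);
        System.out.println(combineGloballyPerWindow(grouped));
    }
}
```

In this model the second step yields a single value per window regardless
of how many keys the first step saw, which is the behaviour I expected
from the real pipeline.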

I'll see if I can come up with an example.

Thanks,
Dan.

On Tue, Mar 5, 2019 at 10:47 AM Kenneth Knowles <[email protected]> wrote:
>
> Two pieces to this:
>
> 1. Every element in a PCollection is always associated with a window, and 
> GroupByKey (hence CombinePerKey) operates per-key-and-window (w/ window 
> merging).
> 2. If an element is not explicitly a KV, then there is no key associated with 
> it.
>
> I'm afraid I don't have any guesses at the problem based on what you've 
> shared. Can you say more?
>
> Kenn
>
> On Tue, Mar 5, 2019 at 10:29 AM Daniel Debrunner <[email protected]> wrote:
>>
>> The windowing section of the Beam programming model guide shows a
>> window defined and used in the GroupByKey transform after a ParDo
>> (section 7.1.1).
>>
>> However I couldn't find any documentation on how long the window
>> remains in scope for subsequent transforms.
>>
>> I have an application with this pipeline:
>>
>> PCollection<KV<A,B>> -> FixedWindow<KV<A,B>> -> GroupByKey ->
>> PCollection<X> -> FixedWindow<X> -> Combine<X,R>.globally ->
>> PCollection<R>
>>
>> The idea is that the first window is aggregating by key but in the
>> second window I need to combine elements across all keys.
>>
>> With my initial app I was seeing some runtime errors in/after the
>> combine where a KV<null,R> was being seen, even though at that point
>> there should be no key for the PCollection<R>.
>>
>> In a simpler test I can apply FixedWindow<X> -> Combine<X,R>.globally
>> -> PCollection<R> to a PCollection without an upstream window and the
>> combine correctly happens once.
>> But then adding the keyed upstream window, the combine occurs once per
>> key without any final combine across the keys.
>>
>> So it seems somehow the memory of the key exists even with the new
>> window transform.
>>
>> I'm probably misunderstanding some detail of windowing, but I couldn't
>> find any deeper documentation than the simple examples in the
>> programming model guide.
>>
>> Can anyone point me in the correct direction?
>>
>> Thanks,
>> Dan.
