Great. Should be easy to sort this out before Go has to make any decisions.

I will take this opportunity to get on my soapbox and suggest instead of
"custom WindowFn" we simply call them "WindowFn". The suffix "Fn" indicates
that it is definable code, not just an enum that selects baked-in
functionality. If you can't run user code for a particular type of Fn, you
don't support it. If you don't support "custom WindowFns" you don't support
WindowFns (but you may support "windowing" in some predefined ways).

Kenn

On Thu, Feb 18, 2021 at 10:13 AM Robert Burke <rob...@frantil.com> wrote:

> A bit more information: graphx/translate.go has the handling of
> WindowingStrategy at pipeline encoding and we only use Non Merging.
>
> Presumably this is something that would need to be fixed when supporting
> Session windows in BEAM-4152
>
>
> On Thu, Feb 18, 2021, 10:02 AM Robert Burke <rob...@frantil.com> wrote:
>
>> Go has very basic windowing support that is managed entirely by the
>> runner. Session windowing isn't implemented yet, let alone custom windowfns
>> which i asume is what would need to specify these things.
>>
>> Session windowing is tracked in BEAM-4152
>> and Custome windowFns are tracked in BEAM-11100.
>>
>> On Wed, Feb 17, 2021, 12:06 PM Kenneth Knowles <k...@apache.org> wrote:
>>
>>> Hi all,
>>>
>>> Yet another exciting corner of portability, discovered during debugging.
>>> Some discussion at https://github.com/apache/beam/pull/14001 and
>>> https://github.com/apache/beam/pull/13998
>>>
>>> **In Java since around the beginning of Beam**
>>> When a merging WindowFn goes through a GBK/Combine and windows are
>>> merged, the downstream windowing is changed to "InvalidWindows" which will
>>> fail any downstream GBK. The user is required to re-window before another
>>> GBK.
>>>
>>> It was to protect a user from this:
>>>
>>> 1. User sets keys and chooses session windowing
>>> 2. User groups/combines by session
>>> 3. User computes the outputs to produce some new keys
>>> 4. User groups again
>>>
>>> The result usually does not make sense. Because it was forbidden we
>>> never decided whether things should merge again or not.
>>>
>>> **In protos**
>>> The MergeStatus enum has NON_MERGING, NEEDS_MERGE, and ALREADY_MERGED.
>>> It is documented that ALREADY_MERGED is for sessions/merging windows after
>>> a GBK.
>>>
>>> This is _maybe_ better. It allows the windows to just be carried along.
>>> It is a major model change and would require SDK support. But it might
>>> still not make sense because the chances that two elements have exactly the
>>> same merging window are very low for something like sessions. It may be
>>> useful for advanced tricks with merging windows, but noone is doing that
>>> because no SDK supports it.
>>>
>>> **In Python**
>>> I think nothing is done. The next GBK will merge again. I could be wrong
>>> - I just read the code very quickly and don't know it that well.
>>>
>>> **In Go**
>>> I didn't even check. Maybe someone can add the status to the thread.
>>>
>>> Kenn
>>>
>>

Reply via email to