[ 
https://issues.apache.org/jira/browse/BEAM-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546539#comment-15546539
 ] 

Amit Sela commented on BEAM-696:
--------------------------------

So it runs once, on the merged window?
That happens at the bundle level, correct? Do bundles always behave the same 
in terms of the number of elements they hold?
If not, and a side input is read on the merged windows, won't the side-input 
value be affected by things like network conditions, or anything else that 
changes the bundle's contents? And if a bundle always holds just one element, 
then merging never happens, correct?

I find this a bit confusing. I think the problem here is that applying a 
CombineFn with side inputs to any PCollection is problematic. 
Sessions seem to be handled like any other BoundedWindow in this respect, but 
they are not.
BTW, isn't this: 
https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/Sessions.java#L84
 saying that side inputs are not allowed on Sessions? In fact they are 
allowed, but what does that mean?
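
To make the bundle question concrete, here is a minimal plain-Java sketch of 
the concern. It is a simulation only and does not use any Beam API: the class, 
the gap of 5, the side-input window size of 10, and the side values 100 and 
200 are all assumptions chosen for illustration. It assumes the side input is 
looked up from the main window's maximum timestamp, and that a pre-combine 
step only sees the partial session built from the elements in its own bundle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation; none of these names are Beam APIs.
class SessionSideInputSketch {
    static final long GAP = 5;               // assumed session gap
    static final long SIDE_WINDOW_SIZE = 10; // assumed fixed-window size of the side input

    // Hypothetical side input: one value per fixed window, keyed by window start.
    static long sideValue(long sideWindowStart) {
        return sideWindowStart == 0 ? 100 : 200;
    }

    // Map a main-input window [start, end) to a side-input value
    // through the window's maximum timestamp.
    static long lookup(long start, long end) {
        long maxTimestamp = end - 1;
        long sideWindowStart = (maxTimestamp / SIDE_WINDOW_SIZE) * SIDE_WINDOW_SIZE;
        return sideValue(sideWindowStart);
    }

    // Merge the given timestamps into one session window (assumes they all
    // belong to a single session) and add the side value once per element.
    static long combineOneGroup(List<Long> timestamps) {
        long start = Long.MAX_VALUE;
        long last = Long.MIN_VALUE;
        for (long t : timestamps) {
            start = Math.min(start, t);
            last = Math.max(last, t);
        }
        return lookup(start, last + GAP) * timestamps.size();
    }

    // Eager pre-combine: each bundle reads the side input through its own
    // partially merged window, so the result depends on the bundling.
    static long preCombine(List<List<Long>> bundles) {
        long sum = 0;
        for (List<Long> bundle : bundles) {
            sum += combineOneGroup(bundle);
        }
        return sum;
    }

    // Deferred combine: buffer every element, merge once, then read the side input.
    static long deferredCombine(List<List<Long>> bundles) {
        List<Long> all = new ArrayList<>();
        for (List<Long> bundle : bundles) {
            all.addAll(bundle);
        }
        return combineOneGroup(all);
    }

    public static void main(String[] args) {
        List<List<Long>> oneBundle = Arrays.asList(Arrays.asList(1L, 4L, 8L));
        List<List<Long>> threeBundles =
            Arrays.asList(Arrays.asList(1L), Arrays.asList(4L), Arrays.asList(8L));
        System.out.println(preCombine(oneBundle));         // 600
        System.out.println(preCombine(threeBundles));      // 400
        System.out.println(deferredCombine(oneBundle));    // 600
        System.out.println(deferredCombine(threeBundles)); // 600
    }
}
```

Under these assumptions the same three elements give 600 or 400 with an eager 
pre-combine, depending only on how they were bundled, while buffering the 
inputs and combining once on the fully merged window gives 600 either way.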

> Side-Inputs non-deterministic with merging main-input windows
> -------------------------------------------------------------
>
>                 Key: BEAM-696
>                 URL: https://issues.apache.org/jira/browse/BEAM-696
>             Project: Beam
>          Issue Type: Bug
>          Components: beam-model
>            Reporter: Ben Chambers
>            Assignee: Pei He
>
> Side-Inputs are non-deterministic for several reasons:
> 1. Because they depend on triggering of the side-input (this is acceptable 
> because triggers are by their nature non-deterministic).
> 2. They depend on the current state of the main-input window in order to 
> look up the side-input. This means that with merging
> 3. Any runner optimizations that affect when the side-input is looked up may 
> cause problems with either or both of these.
> This issue focuses on #2 -- the non-determinism of side-inputs that execute 
> within a Merging WindowFn.
> A possible solution would be to defer running anything that looks up the 
> side-input until we need to extract an output, and to use the main-window at 
> that point. Specifically, if the main-window is a MergingWindowFn, don't 
> execute any kind of pre-combine; instead, buffer all the inputs and combine 
> later.
> This could still run into some non-determinism if there are triggers 
> controlling when we extract output.



