I think the description of when a side input is ready vs expired is the
trouble here.

 - You know that W is expired only when you can be sure that no main input
element could reference it.
 - You know that W is ready *even if it got no data* if the input that
would end up in W would be dropped (aka when W expires according to the
*side input* watermark)
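
As a rough sketch of those two rules (a toy model, not Beam's runner code;
the `Window` class, `gc_time`, and both function names are made up for
illustration, and I'm assuming the main input window maps one-to-one onto W
with the same allowed lateness):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Hypothetical stand-in for side input window W."""
    end: int                  # end timestamp of the window
    allowed_lateness: int = 0

    @property
    def gc_time(self) -> int:
        # Assumed rule: data for W is dropped once a watermark reaches this.
        return self.end + self.allowed_lateness


def side_input_ready(w: Window, side_watermark: int, has_data: bool) -> bool:
    # Ready if W received data, or if any input that would land in W
    # would now be dropped according to the *side input* watermark.
    return has_data or side_watermark >= w.gc_time


def side_input_expired(w: Window, main_watermark: int) -> bool:
    # Expired only when no main input element could still reference W.
    return main_watermark >= w.gc_time
```

The point is that the two checks look at different watermarks: readiness
follows the side input, expiry follows the main input.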

So for your scenario: you push back the elements, which holds W from being
collected. When W expires on the side input, you make it ready. You then
process the pushed-back elements with empty contents for the side input,
and only after that do you GC the side input.
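
Here is that sequence as a minimal sketch, under the assumption of a single
window W that never receives data. `PushbackParDo` and its methods are
hypothetical names for illustration, not anything in the Beam runner API:

```python
class PushbackParDo:
    """Toy model of one side input window W that receives no data."""

    def __init__(self, gc_time):
        self.gc_time = gc_time    # watermark value at which W's data is dropped
        self.main_watermark = 0
        self.ready = False
        self.collected = False
        self.pushed_back = []     # main input elements waiting on W
        self.processed = []       # (element, side input contents) pairs

    def on_main_element(self, x):
        if self.ready:
            self.processed.append((x, []))
        else:
            self.pushed_back.append(x)   # wait for W to become ready

    def on_main_watermark(self, watermark):
        self.main_watermark = watermark
        self._maybe_collect()

    def on_side_watermark(self, watermark):
        if watermark >= self.gc_time:
            # W expired on the side input: it is now ready, with no contents.
            self.ready = True
            for x in self.pushed_back:
                self.processed.append((x, []))
            self.pushed_back = []
            self._maybe_collect()

    def _maybe_collect(self):
        # Pushed-back elements hold W: no GC while any are still waiting.
        if self.main_watermark >= self.gc_time and not self.pushed_back:
            self.collected = True


p = PushbackParDo(gc_time=15)
p.on_main_element("X")     # W not ready, so X is pushed back
p.on_main_watermark(20)    # W is not collected: X still references it
p.on_side_watermark(20)    # W becomes ready (empty), X is processed, then GC
```

So X is neither discarded nor an error; it is processed against an empty
side input.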

Kenn

On Thu, Mar 8, 2018 at 4:32 PM Shen Li <[email protected]> wrote:

> Hi Lukasz,
>
> Let's explain this problem using a specific example.
>
> Say I have a main input element X, which accesses side input window W.
> When X arrives at a ParDo operator, W is not ready and not expired either.
> So, in this case, the ParDo should push back X and wait for W to become
> ready. Say, after two minutes, W is still not ready but has expired because
> the main input watermark advanced. In this situation, how does Beam expect
> runners/engines to handle the pushed back value X? Discard X or throw an
> error?
>
> Thanks,
> Shen
>
> On Thu, Mar 8, 2018 at 6:35 PM, Lukasz Cwik <[email protected]> wrote:
>
>> I believe you're missing this point: "and also to not expire the side
>> input till the main input watermark advances beyond the garbage collection
>> hold of the side input."
>>
>> On Thu, Mar 8, 2018 at 3:33 PM, Shen Li <[email protected]> wrote:
>>
>>> Hi Lukasz,
>>>
>>> Thanks again.
>>>
>>> > the runner is required to hold back the main input till the side
>>> > input is ready
>>>
>>> Yes, I understand these requirements. But what if the side input expires
>>> before it becomes ready?
>>>
>>> Shen
>>>
>>>
>>
>
