On Thu, Apr 28, 2016 at 10:19 AM, Aljoscha Krettek <aljos...@apache.org>
wrote:

> No worries :-) and thanks for the detailed answers!
>
> I still have one question, though: you wrote that "The side input is
> considered ready when there has been any data output/added to the
> PCollection that it is being read as a side input. So the upstream trigger
> controls this." How does this work with side inputs that consist of
> multiple elements, i.e. ListPCollectionView and MapPCollectionView? For
> them, do we also consider the side input as ready once the first element
> arrives? That's why I was wondering about the triggers being responsible
> for deciding when a side input is ready.
>

Yes, just as you describe. The side input window becomes ready once it has
any data. So, combining your items 2.5 and 3, you have a situation where
main input elements may be combined with only a speculative subset of the
side input data. They will not be reprocessed once more up-to-date side
input values become known. Beyond this initial period of waiting for the
very first firing of the side input window, there are no consistency
guarantees between main input and side input windows or triggerings.
For a given runner, updating the side input with a new value may happen at
such high latency that all the main input elements are processed and gone
before the update goes through. It is a somewhat dangerous area for users.
I'm pretty interested in ideas in this space.
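
To make the behavior above concrete, here is a rough sketch in plain Python. It is a toy model, not Beam code: ToySideInputWindow, fire_side_input, process_main and run_demo are all hypothetical names invented for illustration, and a real runner may behave differently in the details. It only shows the two points made above: main input elements wait for the very first firing of the side input window, and elements already processed are not revisited when a later firing arrives.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class ToySideInputWindow:
    """Toy model of a single side input window (hypothetical, not a Beam API)."""

    # Contents published by the most recent upstream trigger firing.
    # None means the window has not fired yet, so it is not ready.
    snapshot: Optional[List[Any]] = None
    # Main input elements held back until the first firing.
    buffered: List[Any] = field(default_factory=list)
    # (main element, side input contents it was processed with)
    outputs: List[Tuple[Any, Tuple[Any, ...]]] = field(default_factory=list)

    def fire_side_input(self, contents: List[Any]) -> None:
        """Simulate an upstream trigger firing with the given contents."""
        first_firing = self.snapshot is None
        self.snapshot = list(contents)
        if first_firing:
            # The window just became ready: release the waiting elements.
            pending, self.buffered = self.buffered, []
            for element in pending:
                self._emit(element)
        # Later firings only replace the snapshot. Nothing already
        # emitted is reprocessed.

    def process_main(self, element: Any) -> None:
        """Process a main input element, or buffer it if not ready."""
        if self.snapshot is None:
            self.buffered.append(element)
        else:
            self._emit(element)

    def _emit(self, element: Any) -> None:
        self.outputs.append((element, tuple(self.snapshot)))


def run_demo() -> List[Tuple[Any, Tuple[Any, ...]]]:
    window = ToySideInputWindow()
    window.process_main("m1")                   # buffered: no firing yet
    window.fire_side_input(["a"])               # speculative first firing
    window.process_main("m2")                   # sees only the subset
    window.fire_side_input(["a", "b", "c"])     # more complete firing
    window.process_main("m3")                   # sees the updated contents
    return window.outputs
```

In this model, m1 and m2 are both paired with the speculative contents ("a",) and stay that way, while only m3 sees ("a", "b", "c"). If the second firing arrived after m3 as well, no main input element would ever see the complete list, which is the high-latency case described above.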

Kenn
