Thanks Kenn for the clear explanation. Very helpful. I am trying to read a
small BQ table as a side input and refresh it every 24 hours or so, but I
still want the main stream to be processed during that time. Is there a
better way to do this than having a 24 hour window with 1 minute triggers on
the side input? Maybe just restarting the job every 24 hours and reading the
side input on setup would be the best option.
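
A minimal sketch of the "read on setup" idea without restarting the job,
assuming the table is small enough to hold in memory. It is plain Python,
not Beam's side input API: RefreshingLookup, enrich, and the loader callable
(standing in for the BQ read) are hypothetical names used only for
illustration. The table is loaded on first use and reloaded once it is older
than the TTL, so main-stream elements never wait on a window. Note that the
refresh here is driven by the wall clock (processing time), not by event
time or watermarks, and each worker would presumably keep its own copy.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class RefreshingLookup:
    """Caches a small lookup table and reloads it once it is older than ttl_seconds."""

    def __init__(
        self,
        loader: Callable[[], Dict[str, str]],  # hypothetical stand-in for the BQ read
        ttl_seconds: float = 24 * 60 * 60,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._table: Optional[Dict[str, str]] = None
        self._loaded_at = 0.0

    def get(self) -> Dict[str, str]:
        """Returns the cached table, reloading it first if it is missing or stale."""
        now = self._clock()
        if self._table is None or now - self._loaded_at >= self._ttl:
            self._table = self._loader()
            self._loaded_at = now
        return self._table


def enrich(element: str, lookup: RefreshingLookup) -> Tuple[str, Optional[str]]:
    """Pairs a main-stream element with its value from the cached table (None if absent)."""
    return element, lookup.get().get(element)
```

In a pipeline, the lookup object would be created once per worker (for
example in a setup hook) and enrich called per element. The trade-off versus
a triggered side input is that different workers may refresh at slightly
different times.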

On Tue, 5 Jan 2021 at 17:53, Kenneth Knowles <[email protected]> wrote:

> You have it basically right. However, there are a couple minor
> clarifications:
>
> 1. A particular window on the side input is not "ready" until there has
> been some element output to it (or it has expired, which will make it the
> default value). Main input elements will wait for the side input to be
> ready. If you configure triggering on the side input, then the first
> triggering will make it "ready". Of course, this means that the value you
> will read will be an incomplete view of the data. If you have a 24 hour window
> with triggering set up then the value that is read will be whatever the
> most recent trigger is, but with some caching delay.
> 2. None of the "time" that you are talking about is real time. It is all
> event time so it is controlled by the side input and main input watermarks.
> Of course in streaming these are usually close to real time so yes on
> average what you describe is probably right.
>
> It sounds like you want a side input with a trigger on it, if you want to
> read it before you have all the data. This is highly nondeterministic so
> you want to be sure that you do not require exact answers on the side input.
>
> Kenn
>
> On Tue, Jan 5, 2021 at 6:56 AM Manninger, Matyas <
> [email protected]> wrote:
>
>> Dear Beam users,
>>
>> Can someone clarify for me how side inputs work in streaming? If I use a
>> stream as a side input to my main stream, each element will be paired with
>> a side input from the corresponding time window. Does this mean that the
>> element will not be processed until the appropriate window on the side
>> input stream is closed? So if my side input is windowed into 24 hour
>> windows, will my elements from the main stream be processed only every 24
>> hours? If not, then if the window is triggered for the side input at 12:00
>> and the input actually only arrives at 12:05, then all elements from the
>> main stream processed between 12:00 and 12:05 will be matched with an empty
>> side input?
>>
>> Any clarification is appreciated.
>>
>> Best regards,
>> Matyas
>>
>
