Hello,

I am trying to figure out if Apache Beam is the right framework for my use 
case. I have an unbounded stream, and there are a number of questions I would 
like to ask regarding the records in the stream:

- For example, one question may be trying to find a record with attribute A 
followed within no more than a minute by a record with attribute B followed 
within no more than 5 minutes by a record with attribute C.
- Another question may be trying to find a sequence of at least N records with 
attribute X within 5 hours of each other, followed by an record with attribute 
Y no more than an hour later.
- A third question would be seeing if there exist a record with attribute K 
*not* followed by a record with attribute L in the next 10 minutes.

Every time I encounter the pattern of records I’m looking for, I would like to 
perform an action. If I understand the Beam model correctly, each question 
would correspond to a separate pipeline I would create, and it also sounds like 
I’m looking for session windows. However, I’m assuming I would need to tee the 
input source to all the separate pipelines? I have tried to look for 
documentation and/or examples on whether Apache Beam can be used to express 
such a setup and how to do it if so, but I haven’t been able to find anything 
concrete. Any help would be appreciated.

Thanks!

Ray


Reply via email to