Hi,

I am new to the streaming world and am trying to build a data pipeline that
analyzes a data stream every hour and outputs a notification. I would love
to hear your advice and experiences regarding upgrading and patching a
streaming job.

I started the prototype using windows to aggregate the data and do the
computation. However, this leads me to wonder what happens if the pipeline
needs to change significantly in the future. Presumably the old pipeline
would have to be stopped and a new one launched, and while doing that, the
data in flight would disappear. Since the notification is mission critical,
missing one is not acceptable for us. I wonder whether people on this
mailing list have run into a similar situation, and I would love to hear
how others are addressing this concern.
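To make the concern concrete, here is a minimal pure-Python sketch of what I mean by hourly windowed aggregation (this is not any particular framework's API; all names are made up for illustration). The point is that the partial window state lives only in process memory:

```python
from collections import defaultdict
from datetime import datetime, timezone

def hourly_window_key(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour-long tumbling window."""
    return ts.replace(minute=0, second=0, microsecond=0)

def aggregate_by_hour(events):
    """Group (timestamp, value) events into hourly windows and sum them.

    The window state here exists only in memory, so stopping the job
    mid-window discards the partial aggregates -- exactly the in-flight
    data I am worried about losing during an upgrade.
    """
    windows = defaultdict(int)
    for ts, value in events:
        windows[hourly_window_key(ts)] += value
    return dict(windows)

events = [
    (datetime(2024, 1, 1, 9, 15, tzinfo=timezone.utc), 3),
    (datetime(2024, 1, 1, 9, 45, tzinfo=timezone.utc), 4),
    (datetime(2024, 1, 1, 10, 5, tzinfo=timezone.utc), 1),
]
print(aggregate_by_hour(events))
```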

I then started to think that an alternative might be to write the stream
into BigQuery and have a batch job that runs every hour consume it. Would
this be a better alternative than trying to solve every problem in
streaming?

Thanks
Eric
