You might be interested in the designs[1] from the [email protected]
e-mail thread around this topic.

1:
https://lists.apache.org/thread.html/3cfbd650a46327afc752a220b20a6081570000725c96541c21265e7b@%3Cdev.beam.apache.org%3E



On Wed, Oct 4, 2017 at 3:06 PM, Yihua Fang <[email protected]>
wrote:

> Hi,
>
> I am new to the streaming world and am trying to build a data pipeline
> that would analyze a data stream every hour and output notifications. I
> would love to hear your advice and experiences with regard to upgrading
> and patching a streaming job.
>
> I started the prototype using windows to aggregate the data and do the
> computation. However, this leads me to wonder what would happen if, in
> the future, the data pipeline needed to be significantly changed.
> Potentially, the old pipeline would need to be stopped and a new one
> launched, and while doing that, the data in flight would disappear.
> Since the notification is mission critical, missing a notification is
> not acceptable for us. I wonder if people on this mailing list have run
> into a similar situation, and I would love to hear how others are
> addressing this concern.
>
> I then started to think that an alternative might be to aggregate the
> stream into BigQuery and write a batch job that runs every hour to
> consume it. Would this be a better approach than trying to solve every
> problem in streaming?
>
> Thanks
> Eric
>
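The hourly aggregation Eric describes can be illustrated without any streaming framework. The sketch below is not Beam's API; it is a minimal, framework-free illustration of the underlying idea: assigning timestamped events to fixed one-hour windows and summing per window (all names here are hypothetical):

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # fixed one-hour windows


def window_start(ts: float) -> float:
    """Truncate a Unix timestamp to the start of its one-hour window."""
    return ts - (ts % WINDOW_SECONDS)


def aggregate_hourly(events):
    """Group (timestamp, value) events into fixed hourly windows and sum
    the values in each window."""
    totals = defaultdict(float)
    for ts, value in events:
        totals[window_start(ts)] += value
    return dict(totals)


# Two events fall in the 01:00 window, one in the 02:00 window.
events = [(3600.0, 1.0), (3700.0, 2.0), (7300.0, 5.0)]
print(aggregate_hourly(events))  # {3600.0: 3.0, 7200.0: 5.0}
```

Because a batch version of this runs over a durable store (e.g. BigQuery), it can simply be rerun after an upgrade, which is one reason the batch alternative sidesteps the in-flight-data concern raised above.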
