Batch processing stream-stream joins

rick bolkey Sat, 02 Jan 2016 19:01:04 -0800

Hi all,

I'm looking for advice on how to set up a Samza job that does a
stream-stream join in such a way that the code can be reused in both
streaming and batch modes (the latter by rehydrating our Kafka queue with
historical data).


It seems we need a way to inform our job that there is "no more data" via an
end-of-stream message. It also seems like any windowing aggregation that
assumes there is no more data while streaming would need to be disabled. I
wasn't able to find much discussion on the topic, so I'm looking for some
pointers.

Thanks
Rick
