On Mon, Jun 9, 2014 at 3:48 PM, Rajiv Onat <[email protected]> wrote:

> a) I have stream of orders (keyed on customerid, source is socket)
> b) I filter for those orders that is from my high value customers (I have
> to make sure I have this list of high value customers available on all bolt
> tasks in memory for fast correlation/projection), so customer id in
> streams correlated to customer id in the list and if the customer type is
> in platinum and gold
> c) Count the orders/amount for last 5 minutes and group by products,
> customer type
>

If you want to join against a customer table, split customers into low and high value, and produce micro-batched 5-minute summaries for those categories, then Spark Streaming is an ideal match. Storm isn't a bad match either, especially since the number of orders is likely small enough that neither approach will be stressed performance-wise.
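
To make the shape of the computation concrete, here is a rough sketch in plain Python of steps (b) and (c): filter orders against an in-memory customer table, then count orders and sum amounts per 5-minute window, product, and customer type. It does not use the Spark Streaming or Storm APIs. The customer table, the order tuple layout, and the names CUSTOMER_TYPES and summarize are all made up for illustration, and I am assuming non-overlapping (tumbling) 5-minute windows, which is what a micro-batched summary would give you.

```python
from collections import defaultdict

# Hypothetical in-memory lookup table: customer id -> customer type.
# In a real job this would be loaded once and shared with every task.
CUSTOMER_TYPES = {"c1": "platinum", "c2": "gold", "c3": "silver"}
HIGH_VALUE = {"platinum", "gold"}
WINDOW_SECONDS = 5 * 60  # assumed tumbling window length


def summarize(orders):
    """Summarize high-value orders per 5-minute window.

    Each order is a (timestamp_seconds, customer_id, product, amount) tuple.
    Returns {(window_start, product, customer_type): [order_count, total_amount]}.
    """
    summary = defaultdict(lambda: [0, 0.0])
    for ts, customer_id, product, amount in orders:
        ctype = CUSTOMER_TYPES.get(customer_id)
        if ctype not in HIGH_VALUE:
            continue  # drop unknown and low-value customers
        # Align the timestamp to the start of its 5-minute window.
        window_start = ts - ts % WINDOW_SECONDS
        bucket = summary[(window_start, product, ctype)]
        bucket[0] += 1
        bucket[1] += amount
    return dict(summary)
```

If "last 5 minutes" is meant as a sliding window rather than fixed batches, the grouping key stays the same and only the window assignment changes.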
