The big difference that I see is that Spark Streaming inherently does micro-batching; Storm can micro-batch or not, as you choose.
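For concreteness, the "count of orders in the last minute" case raised below maps almost directly onto the DStream window operators. A minimal sketch, assuming a plain text socket source on localhost:9999 (one order per line) and a 5-second batch interval, both of which are made up for the example:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object OrderCounts {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("order-counts")
    // each micro-batch covers 5 seconds of input
    val ssc = new StreamingContext(conf, Seconds(5))
    // window operators keep incremental state, so a checkpoint dir is required
    ssc.checkpoint("/tmp/order-counts-checkpoint")

    // hypothetical source: one order per line on a plain text socket
    val orders = ssc.socketTextStream("localhost", 9999)

    // orders seen in the last 60 seconds, recomputed every 5 seconds;
    // the sliding window and its state are handled by the framework
    val lastMinute = orders.countByWindow(Seconds(60), Seconds(5))
    lastMinute.print()

    ssc.start()
    ssc.awaitTermination()
  }
}

The tuple-level work mentioned below (enrichment, filters) would just be map/filter calls on the same DStream before the window step.
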
On Mon, Jun 9, 2014 at 1:40 PM, Machiel Groeneveld <[email protected]> wrote:

> I know Storm fairly well; Spark is probably new to everyone, I haven't
> used it yet. Some thoughts on where Spark Streaming would be a more
> natural fit than Storm:
>
> Spark Streaming
> + Counting messages or computing other statistics on them
> + Sliding windows on streams
> - Programming model (Spark seems to be one big procedure for the entire
>   process)
> - Maturity?
>
> Storm
> + More complex topologies (complexity is delegated to the bolts)
> + Always concurrent (Spark only for some operations)
> + Multiple receivers for one message
> + Maturity
>
> Maybe it's the examples, but Spark seems to be geared towards ad-hoc
> programming. I'm curious what a complex application would look like.
> Performance compared to Storm is also a question mark. Storm has a great
> programming model which at first glance looks more universal than
> Spark's.
>
>
> On 9 June 2014 22:04, Rajiv Onat <[email protected]> wrote:
>
>> Thanks. I'm not sure why you say they are different: from a stream
>> processing use-case perspective both seem to accomplish the same thing,
>> even if the implementations take different approaches. If I want to
>> aggregate and compute stats in Storm, I would have to micro-batch the
>> tuples at some level. For example, for a count of orders in the last
>> minute, in Storm I have to write the sliding-window and state-management
>> code myself, while Spark seems to provide operators for that.
>> Tuple-level operations such as enrichment, filters, etc. seem doable in
>> both.
>>
>>
>> On Mon, Jun 9, 2014 at 12:24 PM, Ted Dunning <[email protected]> wrote:
>>
>>> They are different.
>>>
>>> Storm processes tuples right now, one at a time. Spark Streaming
>>> requires micro-batching (though the batches may cover a really short
>>> time). Spark Streaming allows checkpointing of partial results in the
>>> stream, supported by the framework; Storm says you should roll your own
>>> or use Trident.
>>>
>>> Applications that fit one like a glove are likely to bind a bit on the
>>> other.
>>>
>>>
>>> On Mon, Jun 9, 2014 at 12:16 PM, Rajiv Onat <[email protected]> wrote:
>>>
>>>> I'm trying to figure out whether these are competing technologies for
>>>> stream processing or complementary ones. From an initial read, both
>>>> provide a framework for scaling stream processing, while Spark adds
>>>> window constructs; Apache Spark has Spark Streaming and promises one
>>>> platform for batch, interactive and stream processing.
>>>>
>>>> Any comments or thoughts?
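
To make the "roll your own sliding windows and state management" contrast concrete, here is a rough sketch of the plain-Storm side, written in Scala against the 0.9-era backtype.storm API: a bolt that keeps the one-minute rolling order count itself and uses tick tuples to advance the window. The class name, output field and one-second buckets are made up for the example; Trident would hide some of this boilerplate.

import java.util.{Map => JMap}
import scala.collection.mutable
import backtype.storm.{Config, Constants}
import backtype.storm.task.{OutputCollector, TopologyContext}
import backtype.storm.topology.OutputFieldsDeclarer
import backtype.storm.topology.base.BaseRichBolt
import backtype.storm.tuple.{Fields, Tuple, Values}

class RollingOrderCountBolt extends BaseRichBolt {
  private var collector: OutputCollector = _
  private val window = mutable.Queue[Long]() // finished one-second buckets
  private var current = 0L                   // count in the in-progress bucket

  override def prepare(conf: JMap[_, _], ctx: TopologyContext, out: OutputCollector): Unit = {
    collector = out
  }

  override def execute(tuple: Tuple): Unit = {
    val isTick = tuple.getSourceComponent == Constants.SYSTEM_COMPONENT_ID &&
      tuple.getSourceStreamId == Constants.SYSTEM_TICK_STREAM_ID
    if (isTick) {
      // close the current bucket and slide the window forward
      window.enqueue(current)
      current = 0L
      if (window.size > 60) window.dequeue()
      collector.emit(new Values(Long.box(window.sum)))
    } else {
      current += 1 // every non-tick tuple is assumed to be one order
    }
    collector.ack(tuple)
  }

  override def declareOutputFields(declarer: OutputFieldsDeclarer): Unit = {
    declarer.declare(new Fields("orders_last_minute"))
  }

  // ask Storm to send this bolt a tick tuple once a second
  override def getComponentConfiguration(): JMap[String, AnyRef] = {
    val conf = new Config()
    conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, Int.box(1))
    conf
  }
}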
