I know Storm fairly well; Spark is probably new to everyone, as I haven't
used it yet. Some thoughts on where Spark Streaming would be a more natural
fit than Storm:
Spark Streaming
+ Counting messages or computing other statistics on them
+ Sliding windows on streams
- Programming model (Spark seems to be one big procedure for the entire
process)
- Maturity?

Storm
+ More complex topologies (complexity is delegated to the bolts)
+ Always concurrent (Spark is concurrent only for some operations)
+ Multiple receivers for one message
+ Maturity
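
To make the sliding-window point concrete: below is roughly the
bookkeeping you end up writing by hand in a Storm bolt, and what Spark
Streaming's window operators give you out of the box. This is a plain
Python sketch for illustration, not actual Storm or Spark API code; the
class name and interface are my own invention.

```python
from collections import deque

class SlidingWindowCount:
    """Count events seen in the last `window_seconds` seconds.

    Hypothetical sketch: this is the kind of state management a Storm
    bolt would have to implement itself, while Spark Streaming exposes
    equivalent windowed operators directly.
    """

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.timestamps = deque()  # event times, oldest first

    def add(self, timestamp):
        """Record one event at the given time."""
        self.timestamps.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        """How many events fall inside the window ending at `now`."""
        self._evict(now)
        return len(self.timestamps)

    def _evict(self, now):
        # Drop everything that has slid out of the window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
```

E.g. "count of orders in the last minute": add events at t=0, 10, 30,
65 with a 60-second window, and a count at t=70 sees only the events at
30 and 65. In a real bolt you would also have to worry about persisting
this state across failures, which is where Trident or framework-level
checkpointing comes in.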

Maybe it's the examples, but Spark seems to be geared towards ad-hoc
programming. I'm curious what a complex application would look like.
Performance compared to Storm is also a question mark. Storm has a great
programming model, which at first glance looks more universal than Spark's.
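
The micro-batching Ted describes further down the thread can be sketched
in a few lines: the stream is sliced into consecutive fixed-length
batches, and ordinary batch operators are then applied to each slice.
Again a plain Python illustration with invented names, not the actual
Spark Streaming implementation.

```python
def micro_batches(events, batch_interval):
    """Group (timestamp, value) events into consecutive fixed-length
    batches, mimicking how Spark Streaming discretizes a stream into
    small batches before applying batch operators to each one.

    `events` must be sorted by timestamp.
    """
    if not events:
        return []
    batches = []
    batch_start = events[0][0]  # first batch begins at the first event
    current = []
    for ts, value in events:
        # Close out batches until this event's timestamp fits.
        while ts >= batch_start + batch_interval:
            batches.append(current)
            current = []
            batch_start += batch_interval
        current.append(value)
    batches.append(current)
    return batches
```

With a batch interval of 2 seconds, events at t=0, 1, 2, 3 come out as
two batches: the first two events, then the last two. The trade-off in
the thread follows directly: per-batch operators (counts, joins,
windows) come cheap, but no event is processed before its batch closes.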




On 9 June 2014 22:04, Rajiv Onat <[email protected]> wrote:

> Thanks. Not sure why you say they are different: from a stream-processing
> use-case perspective, both seem to accomplish the same thing, even if the
> implementations take different approaches. If I want to aggregate and do
> stats in Storm, I would have to micro-batch the tuples at some level. E.g.
> for a count of orders in the last minute, in Storm I have to write code
> for sliding windows and state management, while Spark seems to provide
> operators that accomplish that. Tuple-level operations such as enrichment,
> filters, etc. seem doable in both.
>
>
> On Mon, Jun 9, 2014 at 12:24 PM, Ted Dunning <[email protected]>
> wrote:
>
>>
>> They are different.
>>
>> Storm processes tuples right away, one at a time. Spark Streaming
>> requires micro-batching (though the batch interval can be very short).
>> Spark Streaming supports checkpointing of partial results in the stream
>> at the framework level; Storm says you should roll your own or use
>> Trident.
>>
>> Applications that fit one like a glove are likely to bind a bit on the
>> other.
>>
>>
>>
>>
>> On Mon, Jun 9, 2014 at 12:16 PM, Rajiv Onat <[email protected]> wrote:
>>
>>> I'm trying to figure out whether these are competing technologies for
>>> stream processing or complementary ones. From an initial read, both
>>> provide a framework for scaling stream processing, while Spark has
>>> window constructs; Apache Spark also offers Spark Streaming and promises
>>> one platform for batch, interactive, and stream processing.
>>>
>>> Any comments or thoughts?
>>>
>>
>>
>
