The big difference that I see is that Spark Streaming inherently does
micro-batching.  Storm can do micro-batching or not, as you choose.
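
For concreteness, here is a minimal Spark Streaming sketch (Scala, untested;
the socket source, app name, and 1-second interval are just placeholders).
The micro-batch interval is fixed when the StreamingContext is created, and
every downstream operation runs over those small batches rather than tuple
by tuple:

  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  val conf = new SparkConf().setAppName("MicroBatchSketch").setMaster("local[2]")
  // The batch interval is chosen here, once, for the whole application.
  val ssc = new StreamingContext(conf, Seconds(1))

  // Placeholder source: lines of text from a local socket.
  val events = ssc.socketTextStream("localhost", 9999)

  // count() runs once per 1-second micro-batch, not once per tuple.
  events.count().print()

  ssc.start()
  ssc.awaitTermination()

In Storm, by contrast, a bolt's execute() is called once per tuple, and any
batching is something you add yourself (or get from Trident).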




On Mon, Jun 9, 2014 at 1:40 PM, Machiel Groeneveld <[email protected]>
wrote:

> I know Storm fairly well; Spark is probably new to everyone, and I haven't
> used it yet. Some thoughts on where Spark Streaming would be a more natural
> fit than Storm:
> Spark Streaming
> + Counting messages or computing other statistics on them
> + Sliding windows on streams
> - Programming model (Spark seems to be one big procedure for the entire
> process)
> - Maturity?
>
> Storm
> + More complex topologies (complexity is delegated to the bolts)
> + Always concurrent (Spark is concurrent only for some operations)
> + Multiple receivers for one message
> + Maturity
>
> Maybe it's the examples, but Spark seems to be geared towards ad-hoc
> programming. I'm curious what a complex application would look like.
> Performance compared to Storm is also a question mark. Storm has a great
> programming model, which at first glance looks more universal than Spark's.
>
>
>
>
> On 9 June 2014 22:04, Rajiv Onat <[email protected]> wrote:
>
>> Thanks. I'm not sure why you say they are different; from a stream-processing
>> use-case perspective both seem to accomplish the same thing, while the
>> implementations may take different approaches. If I want to aggregate and do
>> stats in Storm, I would have to micro-batch the tuples at some level. E.g.
>> for a count of orders in the last 1 minute, in Storm I have to write code for
>> sliding windows and state management, while Spark seems to provide
>> operators to accomplish that. Tuple-level operations such as enrichment,
>> filters, etc. also seem doable in both.
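>>
>> To make that concrete, something like the following is what I have in mind
>> on the Spark side (an untested sketch; I'm assuming the order events arrive
>> as a DStream[String] named orders):
>>
>>   import org.apache.spark.streaming.{Minutes, Seconds}
>>
>>   // Count of orders seen in the last 1 minute, recomputed every 10 seconds.
>>   // `orders` is assumed to already be a DStream[String] of order events.
>>   val ordersLastMinute = orders.window(Minutes(1), Seconds(10)).count()
>>   ordersLastMinute.print()
>>
>> In Storm I would have to keep a rolling buffer of counts in a bolt and age
>> old entries out myself.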
>>
>>
>> On Mon, Jun 9, 2014 at 12:24 PM, Ted Dunning <[email protected]>
>> wrote:
>>
>>>
>>> They are different.
>>>
>>> Storm allows right-now, per-tuple processing.  Spark Streaming requires
>>> micro-batching (though the batch interval may be really short).  Spark
>>> Streaming allows checkpointing of partial results in the stream, supported
>>> by the framework.  Storm says you should roll your own or use Trident.
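>>>
>>> For example (a rough, untested sketch; the checkpoint path and the per-key
>>> counting are just placeholders), Spark Streaming will carry running state
>>> across batches for you once a checkpoint directory is set:
>>>
>>>   // `pairs` is assumed to be a DStream[(String, Int)], e.g. (key, 1) per event.
>>>   ssc.checkpoint("hdfs:///tmp/streaming-checkpoints")  // placeholder path
>>>
>>>   val updateRunningCount = (newValues: Seq[Int], state: Option[Int]) =>
>>>     Some(newValues.sum + state.getOrElse(0))
>>>
>>>   val runningCounts = pairs.updateStateByKey[Int](updateRunningCount)
>>>
>>> In Storm you keep that state yourself inside a bolt, or let Trident manage it.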
>>>
>>> Applications that fit one like a glove are likely to bind a bit on the
>>> other.
>>>
>>>
>>>
>>>
>>> On Mon, Jun 9, 2014 at 12:16 PM, Rajiv Onat <[email protected]> wrote:
>>>
>>>> I'm trying to figure out whether these are competing technologies for
>>>> stream processing or complementary. From an initial read of their
>>>> stream-processing capabilities, both provide a framework for scaling,
>>>> Spark has window constructs, and Apache Spark with Spark Streaming
>>>> promises one platform for batch, interactive, and stream processing.
>>>>
>>>> Any comments or thoughts?
>>>>
>>>
>>>
>>
>
