Re: Storm patterns vis-a-vis external data storage

Hemanth Yamijala Wed, 07 Jan 2015 22:28:57 -0800

Itai & Jens,

Thank you for sharing your thoughts. My requirement is what Jens
referred to as "exporting" data out of my topology.


I can clearly see the benefits of moving this functionality into a separate
bolt, e.g. to scale it independently of the processing bolts, or to
accommodate changes.
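
To make the "accommodating changes" point concrete, here is a minimal
sketch in plain Java, with no Storm dependency. All names in it (ExportSink,
InMemorySink, Processor, ExportSketch) are hypothetical and only stand in
for the tuple interface between a processing bolt and its export bolts; in
a real topology the wiring would be done through Storm's own builder, and
the sink would write to a NoSQL store or Kafka rather than to a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the "tuple interface" between bolts; not a Storm API.
interface ExportSink {
    void export(String record);
}

// Hypothetical export stage that only collects records in memory.
// A real export bolt would write to an external store instead.
class InMemorySink implements ExportSink {
    final List<String> stored = new ArrayList<>();

    @Override
    public void export(String record) {
        stored.add(record);
    }
}

// Processing stage: knows nothing about storage, it only emits downstream.
class Processor {
    private final List<ExportSink> downstream = new ArrayList<>();

    // Attaching another export path does not touch the processing logic.
    Processor connect(ExportSink sink) {
        downstream.add(sink);
        return this;
    }

    void process(String message) {
        String result = message.trim().toUpperCase(); // placeholder computation
        for (ExportSink sink : downstream) {
            sink.export(result);
        }
    }
}

class ExportSketch {
    public static void main(String[] args) {
        InMemorySink primary = new InMemorySink();
        InMemorySink additional = new InMemorySink();
        Processor processor = new Processor().connect(primary).connect(additional);
        processor.process(" hello ");
        System.out.println(primary.stored + " " + additional.stored);
    }
}
```

The second connect() call is the whole cost of adding an export path here;
the Processor class stays unchanged, which is the property I would hope to
keep in the topology as well.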

The only negative (if it is one) seems to be the increase in the number of
runtime bolt instances in the topology. I understand that this can be solved
with more hardware resources and Storm's horizontal scalability. Also, the
cost might be hard to quantify precisely, given the different scaling
requirements of processing-bound and I/O-bound bolts. Do you see this as a
concern?

Thanks
hemanth

On Wed, Jan 7, 2015 at 9:39 PM, Jens-U. Mozdzen <[email protected]> wrote:

> Hi Hemanth,
>
> Quoting Hemanth Yamijala <[email protected]>:
>
>> Hi all,
>>
>> I guess it is common to build topologies where message processing in
>> Storm results in data that should be stored in external stores like NoSQL
>> DBs or message queues like Kafka.
>>
>> There are two broad approaches to handle this storage:
>>
>> 1) Inline the storage functionality with the processing functionality -
>> i.e. the bolt generating the info to be stored also takes care of storing
>> it.
>> 2) Separate out the two and make a downstream bolt responsible for the
>> storage.
>>
>> Just wanted to see if people on the list think there are advantages to
>> favouring one approach over the other, or any pitfalls to watch for in
>> either case.
>>
>
> I'd say: it depends ;) In the case of aggregation bolts that persist their
> state, you may want to limit the memory footprint of each bolt instance.
> An in-memory cache for persisted data is pretty helpful there, but it
> means building persistence access into each bolt.
>
> OTOH, if you plan to "export" data from your topology (which seems to be
> the main focus of your question), separating calculation and "export" into
> separate bolts seems a natural choice to me, especially when you consider
> future changes (e.g. supporting different or possibly *additional* export
> paths: you can keep the "tuple interface" as it is and simply connect
> different and/or additional export bolts).
>
> Regards,
> Jens
>
>
