If the number of such events (user name updates) in your system is not too large, you can do it online, i.e. acknowledge the name change to the user when he interacts with your site only after you have updated all the relevant documents in your NoSQL db. But if there are too many events and you would like to process them asynchronously so that the latency to the end user stays low, you can decouple them into a nearline system where you stage such events in a Kafka queue and have some modules running that pull from the Kafka queue and update all the systems you have to.

Even for this you may not need Storm. But you can use Storm if you view it as a PaaS: taking care of fault tolerance for failed nodes and pushing new code out to all your nodes is not an easy task to handle yourself, and Storm does this for you for free. So you can use Storm to pull from the Kafka queue and update the appropriate data stores. If you just package the bolt and give it to Storm, it will make sure the requested number of instances keeps running for you, so it acts as a PaaS and you do not have to keep monitoring the batch system you would otherwise have to run yourself.
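To make that concrete, here is a minimal sketch of such a topology using the storm-kafka spout (0.9.x-era backtype.storm APIs). The ZooKeeper host, topic name, message format, and the actual document-store call are placeholders for whatever you actually use:

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Tuple;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;

public class UserNameSyncTopology {

    // Applies one "user renamed" event to every document embedding the name.
    public static class PropagateNameBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            // Assumed message format "userId:newName"; adapt to your events.
            String[] parts = input.getStringByField("str").split(":", 2);
            String userId = parts[0];
            String newName = parts[1];
            // Placeholder: issue the fan-out update with your NoSQL client,
            // e.g. set author.name = newName in every comment document
            // where author.id == userId.
            // BaseBasicBolt acks automatically when execute() returns;
            // throwing FailedException instead makes the spout replay the event.
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // Terminal bolt: nothing emitted downstream.
        }
    }

    public static void main(String[] args) throws Exception {
        SpoutConfig spoutConfig = new SpoutConfig(
                new ZkHosts("zkhost:2181"),   // your ZooKeeper ensemble
                "user-updates",               // assumed topic name
                "/kafka-offsets", "name-sync");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("events", new KafkaSpout(spoutConfig), 1);
        builder.setBolt("propagate", new PropagateNameBolt(), 4)
               .shuffleGrouping("events");

        StormSubmitter.submitTopology("user-name-sync", new Config(),
                builder.createTopology());
    }
}

Because the bolt extends BaseBasicBolt, each Kafka message is acked automatically once execute() returns, so if a worker node dies the pending updates are simply replayed on another node.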
On Sat, Aug 15, 2015 at 4:40 AM John Yost <[email protected]> wrote:

> Hi Michel,
>
> I am actually doing something very similar. I am processing data coming
> in from a KafkaSpout in Bolt A and then sending the processed tuples to
> Bolt B, where I cache 'em until the collections reach a certain size and
> then flush the key/value pairs to SequenceFiles that are then loaded into
> HBase.
>
> My topology is working well except for the Bolt A to Bolt B step, but I
> got some great feedback and ideas from Javier and Kishore, and will apply
> their thoughts. I think you are on the right track, especially given the
> message processing guarantees embedded within Storm.
>
> --John
>
> On Fri, Aug 14, 2015 at 2:17 PM, Michel Blase <[email protected]> wrote:
>
>> Hi all,
>>
>> I'm very new to apache-storm and I have to admit I don't have a lot of
>> experience with NoSQL in general either.
>>
>> I'm modelling my data using a document-based approach and I'm trying to
>> figure out how to update versions of a (sub)document stored in different
>> "documents". It's the classic scenario where you store the user's info
>> in the comments table. Updates to the user's name (for example) should
>> be propagated to all the comments.
>>
>> My understanding is that in this scenario people would trigger a
>> procedure on the user's name update that scans all the related documents
>> to update the user's name.
>>
>> I was considering using apache-storm to propagate updates and I would
>> like to have some feedback from more experienced developers on this kind
>> of problem.
>>
>> Would apache-storm be too much? Should I just use zookeeper?
>>
>> My understanding is that apache-storm is mostly used for complex data
>> manipulations, and here all I need to do is keep the data in sync for
>> consistency when accessed by users. Am I going in the wrong direction?
>> How do you guys solve this kind of problem?
>>
>> Thanks,
>> Michel
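A minimal sketch of the batch-and-flush bolt John describes above, with manual acking so a failed flush gets replayed; the flush threshold and the writeBatch() body are placeholder assumptions:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

public class BatchingBolt extends BaseRichBolt {
    private static final int FLUSH_SIZE = 1000;  // assumed threshold

    private OutputCollector collector;
    private List<Tuple> buffer;

    @Override
    public void prepare(Map conf, TopologyContext context,
                        OutputCollector collector) {
        this.collector = collector;
        this.buffer = new ArrayList<Tuple>();
    }

    @Override
    public void execute(Tuple tuple) {
        buffer.add(tuple);
        if (buffer.size() >= FLUSH_SIZE) {
            try {
                writeBatch(buffer);          // e.g. append to a SequenceFile
                for (Tuple t : buffer) {
                    collector.ack(t);        // ack only after a durable write
                }
            } catch (Exception e) {
                for (Tuple t : buffer) {
                    collector.fail(t);       // fail the batch so Storm replays it
                }
            }
            buffer.clear();
        }
    }

    private void writeBatch(List<Tuple> batch) {
        // Placeholder: write key/value pairs to a SequenceFile for HBase load.
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Terminal bolt in this sketch: nothing emitted downstream.
    }
}

Note that topology.max.spout.pending has to be set larger than the flush size, or the spout will stall waiting for acks that only arrive once the buffer fills.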
