Storm can work for this, but in general you won't get much out of Storm if your topologies end up being a single bolt doing a single task. At that point it degenerates into a simple job framework. The one benefit you do get is code reuse across the spouts, but you can get the same reuse by generalizing your own worker template.
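To make "generalizing your worker template" concrete, here is a rough sketch of what that could look like without Storm. It is not our actual code: the names WorkerTemplate and PageParseWorker are made up for illustration, and an in-memory BlockingQueue stands in for whatever queue you really consume from. The polling loop and failure handling are written once, and each job type only fills in process().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical worker template: the queue-draining loop lives here,
// and each job type only supplies process().
abstract class WorkerTemplate<J> implements Runnable {
    private final BlockingQueue<J> source;   // stand-in for a real job queue
    private volatile boolean running = true;

    WorkerTemplate(BlockingQueue<J> source) {
        this.source = source;
    }

    // The only method a concrete job type must implement.
    protected abstract void process(J job) throws Exception;

    // Default failure handling; override to retry or dead-letter the job.
    protected void onFailure(J job, Exception e) {
        System.err.println("job failed: " + job + " (" + e + ")");
    }

    // Ask the worker to exit once the queue has been drained.
    public void stop() {
        running = false;
    }

    @Override
    public void run() {
        while (running || !source.isEmpty()) {
            try {
                J job = source.poll(100, TimeUnit.MILLISECONDS);
                if (job == null) {
                    continue; // nothing available yet, re-check the stop flag
                }
                try {
                    process(job);
                } catch (Exception e) {
                    onFailure(job, e);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}

// Example job type: "parses" a page by recording its URL.
class PageParseWorker extends WorkerTemplate<String> {
    final List<String> parsed = new ArrayList<>();

    PageParseWorker(BlockingQueue<String> source) {
        super(source);
    }

    @Override
    protected void process(String url) {
        parsed.add("parsed:" + url);
    }
}
```

You would run one or more of these per job type on a plain thread pool. What this does not give you is anything Storm-specific like acking or routing between stages, which is the point: for single-stage jobs you may not need it.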
We had a lot of topologies that were essentially like this and ended up moving them off Storm, because we found it difficult to provide QoS across different users under Storm (one user could send tons of jobs and starve out other users). If you do have multi-stage jobs and/or certain pieces that need more CPU than others, though, Storm can work pretty well.

Michael Rose (@Xorlev <https://twitter.com/xorlev>)
Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
[email protected]

On Thu, Jan 8, 2015 at 7:21 AM, noodles <[email protected]> wrote:

> In our project, there are a *looooooooot* of background jobs, such as *parsing
> web pages*, *downloading videos*, *determining web video's status*, etc.
> According to our experiences, we evaluate that 50M jobs will need to be
> consumed in a day. We are considering to use storm as the background
> framework. But I have no experience on storm, so I wish a help here.
>
> Thanks
>
> --
> *Yeah, I'm noodles!*
>
