How about adding an extra field (user id or job id) to the data chunks and using that field to distinguish tasks? A rough sketch of what that could look like is below.
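For illustration only, here is a minimal sketch of a bolt that carries such a tag through the topology. It assumes the spout emits tuples with hypothetical fields "jobId" and "chunk" (those names, and the transform() helper, are made up for the example), and it simply re-emits the jobId alongside each result so a downstream accumulator can group records per job (e.g. with fieldsGrouping on "jobId"):

    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    public class JobTaggedBolt extends BaseBasicBolt {

        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            // The job/user id travels with every chunk, so records from
            // different tasks never get mixed up even in a long-running topology.
            String jobId = input.getStringByField("jobId");
            String chunk = input.getStringByField("chunk");

            // Placeholder for whatever transformation your bolts actually do.
            String result = transform(chunk);

            // Re-emit the jobId with the result so downstream bolts
            // (e.g. an accumulator) can group output per job.
            collector.emit(new Values(jobId, result));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("jobId", "result"));
        }

        private String transform(String chunk) {
            // Hypothetical transformation; replace with real processing.
            return chunk.trim().toLowerCase();
        }
    }

The final accumulator bolt could then keep per-jobId state and, once it has seen all records for a job (e.g. via a count or an end-of-job marker tuple), send the aggregated result back to the user.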
Thanks
Milinda

On Wed, Mar 19, 2014 at 11:21 AM, Eugene Dzhurinsky <[email protected]> wrote:
> Hello!
>
> I'm evaluating Storm for a project which involves processing many
> distinct small tasks in the following way:
>
> - a user supplies some data source
>
> - a spout is attached to the source and produces chunks of data for the topology
>
> - bolts process the chunks of data and transform them somehow (in general
>   reducing the number of chunks, so the number of records at the sink is much
>   smaller than the number of records emitted by the spout)
>
> - when all records are processed, the results are accumulated and sent back
>   to the user.
>
> As far as I understand, a topology is supposed to be kept running forever, so
> I don't really see an easy way to "distinguish" the records of one task
> from the records of another. Should a new topology be started for each new
> task of a user?
>
> Thank you in advance! Links to any appropriate articles are very welcome :)
>
> --
> Eugene N Dzhurinsky

--
Milinda Pathirage
PhD Student | Research Assistant
School of Informatics and Computing | Data to Insight Center
Indiana University

twitter: milindalakmal
skype: milinda.pathirage
blog: http://milinda.pathirage.org
