How about adding an extra field (user id or job id) to the data chunks and using that field to distinguish tasks? A rough sketch of what that could look like is below.
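For illustration only, here is a minimal sketch of a bolt that carries such a tag through the topology. It assumes the spout emits tuples with hypothetical fields "jobId" and "chunk" (those names, and the transform() helper, are made up for the example), and it simply re-emits the jobId alongside each result so a downstream accumulator can group records per job (e.g. with fieldsGrouping on "jobId"):

    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    public class JobTaggedBolt extends BaseBasicBolt {

        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            // The job/user id travels with every chunk, so records from
            // different tasks never get mixed up even in a long-running topology.
            String jobId = input.getStringByField("jobId");
            String chunk = input.getStringByField("chunk");

            // Placeholder for whatever transformation your bolts actually do.
            String result = transform(chunk);

            // Re-emit the jobId with the result so downstream bolts
            // (e.g. an accumulator) can group output per job.
            collector.emit(new Values(jobId, result));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("jobId", "result"));
        }

        private String transform(String chunk) {
            // Hypothetical transformation; replace with real processing.
            return chunk.trim().toLowerCase();
        }
    }

The final accumulator bolt could then keep per-jobId state and, once it has seen all records for a job (e.g. via a count or an end-of-job marker tuple), send the aggregated result back to the user.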
Thanks
Milinda

On Wed, Mar 19, 2014 at 11:21 AM, Eugene Dzhurinsky <[email protected]> wrote:
> Hello!
>
> I'm evaluating Storm for a project which involves processing many
> distinct small tasks in the following way:
>
> - a user supplies some data source
>
> - a spout is attached to the source and produces chunks of data for the topology
>
> - bolts process the chunks of data and transform them somehow (in general
>   reducing the number of chunks, so the number of records at the sink is much
>   smaller than the number of records emitted by the spout)
>
> - when all records are processed, the results are accumulated and sent back
>   to the user.
>
> As far as I understand, a topology is supposed to be kept running forever, so
> I don't really see an easy way to "distinguish" the records of one task
> from the records of another. Should a new topology be started for each new
> task of a user?
>
> Thank you in advance! Links to any appropriate articles are very welcome :)
>
> --
> Eugene N Dzhurinsky

--
Milinda Pathirage
PhD Student | Research Assistant
School of Informatics and Computing | Data to Insight Center
Indiana University

twitter: milindalakmal
skype: milinda.pathirage
blog: http://milinda.pathirage.org
