Hi,

The problem I'm trying to solve requires every task of a particular bolt to
read and update a common data structure that is shared among those tasks.



*The problem*

The tuples in my data stream are just numbers. I need to maintain two
'buckets', each holding the sum of the numeric data points added to it so
far. For example, when a new data point (a number *'x'*) is emitted,
it will be added to:


   - Bucket 1 if 'x' is greater than the average of the data points added
   to Bucket 1 so far

   - Bucket 2 if 'x' is less than the average of the data points added
   to Bucket 2 so far
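
To make the rule above concrete, here is a minimal single-machine sketch in
Python. It is only an illustration, not the real (more complex) allocation
logic. The names Bucket and allocate are made up for this example, and
three behaviours are assumptions that the description above does not
specify: an empty bucket accepts any point, Bucket 1 is checked before
Bucket 2, and a point that matches neither rule is left unallocated.

```python
class Bucket:
    """Running sum and count, so the average can be derived on demand."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def average(self):
        # None signals "no data yet" (assumption: empty buckets accept anything).
        return self.total / self.count if self.count else None

    def add(self, x):
        self.total += x
        self.count += 1


def allocate(x, bucket1, bucket2):
    """Add x to a bucket and return 1 or 2, or None if neither rule matches."""
    avg1 = bucket1.average()
    avg2 = bucket2.average()
    # Assumption: Bucket 1 is tried first when both rules could apply.
    if avg1 is None or x > avg1:
        bucket1.add(x)
        return 1
    if avg2 is None or x < avg2:
        bucket2.add(x)
        return 2
    # Assumption: a point matching neither rule is left unallocated.
    return None
```

Every call reads and then updates the same two buckets, which is exactly
the state that would have to be shared between the tasks of the bolt.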




*What I need to know*

Can a problem like this be solved using Storm? That is, can Storm handle
the case where a decision must be based *not only on the current incoming
tuple* but also on a *common data structure* shared across tasks of the
same bolt?



I'm planning to use Storm because the data point allocation decision
(which is actually more complex than what I've described here) is slow on
a single machine, and I'm looking to distribute the computation to achieve
higher data stream rates.




I would highly appreciate any help.


Thanks,

Best Regards,
Kosala
