I want to implement a topology that is similar to the RollingTopWords
topology in the Storm examples
<https://github.com/apache/incubator-storm/tree/master/examples/storm-starter>.
The idea is to count the frequency of the words emitted. The spouts emit
words at random, and the first-level bolts count each word's frequency and
pass the counts on. The twist is that I want the bolts to pass on a word's
frequency only if its count on at least one bolt has exceeded a threshold.
So, for example, if the word "Nathan" passed the threshold of 5 occurrences
within a time window on one bolt, then all bolts would start passing
"Nathan"'s frequency onwards.

What I thought of doing is adding another layer of bolts that holds the
list of words that have passed the threshold. These bolts would receive
the words and frequencies from the previous layer and pass them on only
if the word appears in the list. Obviously, this list would have to be
synchronized across the whole layer of bolts.
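
To make the two-layer idea concrete, here is a rough sketch in plain Python
rather than Storm code. The class names CountingBolt and FilterBolt, the
end_window method, and the in-process set standing in for the synchronized
list are all hypothetical; it also assumes that "passing the threshold"
means the count is strictly greater than it. It says nothing about how the
list would actually be kept in sync across workers in a real topology.

```python
from collections import Counter


class CountingBolt:
    """Hypothetical stand-in for a first-level bolt (not a Storm class)."""

    def __init__(self):
        self.counts = Counter()

    def process(self, word):
        # Count the word and pass (word, running count) downstream.
        self.counts[word] += 1
        return word, self.counts[word]

    def end_window(self):
        # Assumption: counts are simply reset when the time window ends.
        self.counts.clear()


class FilterBolt:
    """Hypothetical second-layer bolt that forwards only listed words."""

    def __init__(self, hot_words, threshold):
        self.hot_words = hot_words  # set shared by every FilterBolt
        self.threshold = threshold

    def process(self, word, count):
        # Assumption: "passed the threshold" means count > threshold.
        if count > self.threshold:
            self.hot_words.add(word)
        # Forward the tuple only if the word is on the shared list.
        return (word, count) if word in self.hot_words else None


# A single in-process set stands in for the synchronized list.
hot_words = set()
counter_a, counter_b = CountingBolt(), CountingBolt()
filter_a, filter_b = FilterBolt(hot_words, 5), FilterBolt(hot_words, 5)

# Path B sees "Nathan" once: not on the list yet, so nothing is forwarded.
before = filter_b.process(*counter_b.process("Nathan"))

# Path A sees "Nathan" six times: the sixth count exceeds the threshold.
for _ in range(6):
    last_a = filter_a.process(*counter_a.process("Nathan"))

# Path B now forwards "Nathan" even though its own count is only 2.
after = filter_b.process(*counter_b.process("Nathan"))
```

In this sketch every FilterBolt reads and writes the same set object, which
is exactly the part that does not carry over to a distributed layer of
bolts; that shared state is what the question below is about.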

Is this a good idea? What would be the best way of implementing it?
