Hi, and thanks.

I'm working on a parallel algorithm that counts a massive number of items in 
data streams. Previous research on parallelizing this algorithm focused on 
multi-core CPUs; however, I want to take advantage of Storm.

Processing latency is extremely important for this algorithm, so I did some 
performance evaluation.

First, I implemented the algorithm in Java (one thread, no parallelism) and 
measured its performance: it could process 3 million items per second.
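
For illustration, a single-threaded measurement of this kind might look roughly like the sketch below. The actual counting algorithm is not shown in this post, so an exact HashMap frequency counter stands in for it; the class name, method names, stream size and key range are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical stand-in for the real algorithm: an exact per-item frequency counter.
final class ThroughputSketch {

    // Counts how often each item occurs in the stream (single thread, no parallelism).
    static Map<Integer, Long> countItems(int[] stream) {
        Map<Integer, Long> counts = new HashMap<>();
        for (int item : stream) {
            counts.merge(item, 1L, Long::sum);
        }
        return counts;
    }

    // Converts an item count and an elapsed time in nanoseconds to items per second.
    static double itemsPerSecond(long items, long elapsedNanos) {
        return items * 1e9 / elapsedNanos;
    }

    public static void main(String[] args) {
        int n = 3_000_000;                 // assumed stream size
        int[] stream = new int[n];
        Random rng = new Random(42);
        for (int i = 0; i < n; i++) {
            stream[i] = rng.nextInt(100_000); // assumed key range
        }

        countItems(stream); // warm-up run so the JIT can compile the hot loop first

        long start = System.nanoTime();
        Map<Integer, Long> counts = countItems(stream);
        long elapsed = System.nanoTime() - start;

        System.out.printf("distinct=%d, throughput=%.0f items/s%n",
                counts.size(), itemsPerSecond(n, elapsed));
    }
}
```

The warm-up run is there because the first pass through a Java loop typically includes JIT compilation time, which would otherwise be counted as processing time.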

Second, I wrapped this implementation of the algorithm in Storm (just one Spout 
doing the processing) and measured the performance again: it could process only 
0.75 million items per second. I changed my implementation a little to fit 
Storm's structure, but in the end the performance is still not good.

P.S. I didn't take network overhead into consideration, because I run the 
program in a single Spout node only, so there is no emit or transfer (so for 
now I don't care how Storm emits messages between nodes). The program in the 
Spout is actually doing the same thing as the former one (I just copied the 
program into the nextTuple() method with some necessary changes).
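
To make the shape of that change concrete, here is a Storm-free sketch. It does not use any Storm classes: the class, its nextTuple() method and the drive() loop are hypothetical imitations, written on the assumption that each call handles one item and that the framework calls the method repeatedly from a single thread. The HashMap counter is again only a stand-in for the real algorithm.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, Storm-free imitation of the setup; none of these names are Storm APIs.
final class SpoutShapeSketch {
    private final int[] stream;
    private final Map<Integer, Long> counts = new HashMap<>();
    private int position = 0;

    SpoutShapeSketch(int[] stream) {
        this.stream = stream;
    }

    // Imitates a spout's nextTuple(): handles at most one item per call and emits nothing.
    void nextTuple() {
        if (position < stream.length) {
            counts.merge(stream[position++], 1L, Long::sum);
        }
    }

    // True once every item of the input has been consumed.
    boolean done() {
        return position >= stream.length;
    }

    Map<Integer, Long> counts() {
        return counts;
    }

    // Imitates the framework side: calls nextTuple() in a loop from one thread
    // until the input is exhausted, and returns the elapsed time in nanoseconds.
    static long drive(SpoutShapeSketch spout) {
        long start = System.nanoTime();
        while (!spout.done()) {
            spout.nextTuple();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 3_000_000; // assumed stream size
        int[] stream = new int[n];
        for (int i = 0; i < n; i++) {
            stream[i] = i % 100_000; // assumed key range
        }
        long elapsed = drive(new SpoutShapeSketch(stream));
        System.out.printf("throughput=%.0f items/s%n", n * 1e9 / elapsed);
    }
}
```

Timing this loop outside Storm and comparing it with the number measured inside the topology would show how much of the gap comes from the per-call restructuring alone and how much from whatever the framework does around each call.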

1. Is the degradation (to 1/4 of the speed) inevitable?
2. What causes the degradation?
3. How can I reduce the degradation?

Thank you all.



