I'm also interested in the answers to this question. To add to the
discussion, take a look at
http://aadrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html.
I suspect Storm still introduces coordination overhead even when running on
a single machine.
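
One way to get a feel for that suspicion without Storm in the picture is a
rough sketch like the one below. It is plain Java, not Storm code: `Counter`,
`drive`, and the `bookkeeping` field are hypothetical stand-ins I made up for
the counting algorithm, the loop that repeatedly calls the spout, and whatever
fixed work happens around each call. It only shows how a fixed per-call cost
is amortized when each call handles a batch of items instead of one.

```java
final class SpoutLoopSketch {
    // Hypothetical stand-in for the counting algorithm: one cheap update per item.
    static final class Counter {
        long items;
        long checksum;

        void process(long item) {
            items++;
            checksum += item ^ (item >>> 7);
        }
    }

    // Hypothetical stand-in for fixed per-call bookkeeping (an assumption, not Storm internals).
    static volatile long bookkeeping;

    // Invokes the "spout" `calls` times; each invocation processes `batch` items.
    static Counter drive(long calls, int batch) {
        Counter counter = new Counter();
        long next = 0;
        for (long c = 0; c < calls; c++) {
            bookkeeping++; // fixed cost paid once per call
            for (int i = 0; i < batch; i++) {
                counter.process(next++);
            }
        }
        return counter;
    }

    // Times one run and reports throughput in items per second.
    static double itemsPerSecond(long calls, int batch) {
        long start = System.nanoTime();
        Counter counter = drive(calls, batch);
        double seconds = (System.nanoTime() - start) / 1e9;
        return counter.items / seconds;
    }

    public static void main(String[] args) {
        long total = 50_000_000L;
        for (int batch : new int[] {1, 100, 10_000}) {
            itemsPerSecond(total / batch, batch); // warm-up run so the JIT compiles the loop
            System.out.printf("batch=%d: %.0f items/s%n",
                    batch, itemsPerSecond(total / batch, batch));
        }
    }
}
```

The numbers will vary by machine and JVM, and a volatile increment is far
cheaper than anything a real framework does per call, so treat this only as a
way to compare batch sizes on equal terms, not as a model of Storm itself.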
On Tue, 12 May 2015 at 1:39 pm [email protected] <[email protected]>
wrote:

> Hi, and thanks.
>
> I'm working on a parallel algorithm that counts massive numbers of items
> in data streams. Previous research on parallelizing this algorithm focused
> on multi-core CPUs; however, I want to take advantage of Storm.
>
> Processing latency is extremely important for this algorithm, so I did
> some evaluation of the performance.
>
> Firstly, I implemented the algorithm in Java (one thread, with no
> parallelism) and measured its performance: it could process 3 million
> items per second.
>
> Secondly, I wrapped this implementation of the algorithm in Storm (just
> one Spout doing the processing) and measured the performance: it could
> process only 0.75 million items per second. I changed my implementation a
> little to fit Storm's structure, but in the end the performance is still
> not good.
>
> P.S. I didn't take network overhead into consideration because I run the
> program on a single Spout node, so there is no emit or transfer (so I
> don't care how Storm emits messages between nodes for now). The program in
> the Spout is actually doing the same thing as the former one (I just
> copied the program into the nextTuple() method with some necessary
> changes).
>
> 1. Is the degradation (to 1/4 of the speed) inevitable?
> 2. What causes the degradation?
> 3. How can I reduce the degradation?
>
> Thank you all.
>
> ------------------------------
>  [email protected]
>
