Re: Storm throughput

Roshan Naik Fri, 30 Mar 2018 22:20:06 -0700

 
Something is definitely broken in your run or in your measurement method,
and it's not your hardware that is at fault. The machine on which those numbers
were run had lots of cores, but the cores were not fast at all. Even my mid-2015
MacBook Pro has faster cores than that machine, which had old Intel CPUs.
You may be making some mistakes in your calculations. Just run the topo for
about 14 mins, take the 10 min window reading directly from the UI, and
calculate the per-second throughput from that (that way you disregard the first 3
or 4 mins to allow for warm-up). Also, are you overriding any default settings?
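
As a rough illustration of that calculation, here is a minimal Python sketch. It assumes the 10 min window reading is a plain tuple count covering 600 seconds; the function name and the sample count are hypothetical and not taken from any actual run.

```python
def per_second_throughput(window_count: int, window_seconds: int = 600) -> float:
    """Average tuples per second over a UI window (10 min = 600 s by default)."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return window_count / window_seconds


# Hypothetical reading: the 10 min window shows 19,200,000 emitted tuples.
print(per_second_throughput(19_200_000))  # 32000.0
```

Because the reading is taken after about 14 minutes, the 10 min window excludes the first few minutes of warm-up.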

Here is the code for the topo that was used:
https://github.com/apache/storm/blob/1.1.x-branch/examples/storm-perf/src/main/java/org/apache/storm/perf/ConstSpoutOnlyTopo.java

-roshan

On Friday, March 30, 2018, 8:24:39 AM PDT, Alessio Pagliari
<[email protected]> wrote:

Surely they work on a way more powerful cluster, but the topology is composed
of just one spout: no parallelization, no bolts, and a total of one worker, so
1 thread in a JVM. Even if I had 100 cores like them, it shouldn't make any
difference. Please correct me if I'm wrong.

Such a topology will assign its only spout to a worker on one node, so the
multi-node cluster is pointless. Meanwhile, regarding the number of cores, one
executor cannot run on multiple cores at the same time, since it is not a
multi-threaded process.

Is there some Storm or Java behavior that I'm not aware of?

Thank you,

Alessio

Sent from BlueMail

On Mar 30, 2018, at 4:28 PM, Jacob Johansen
<[email protected]> wrote:
For their test, they were using 4 worker nodes (servers), each with 24 vCores,
for a total of 96 vCores. Most laptops max out at 8 vCores and are typically at
4-6 vCores.
Jacob Johansen

On Fri, Mar 30, 2018 at 9:18 AM, Alessio Pagliari <[email protected]>
wrote:

Hi everybody,
I’m running some preliminary tests with Storm to understand how far it
can go. For now I’m focusing on its maximum throughput in terms of tuples
per second. I saw the benchmark done by the guys at Hortonworks
(ref: https://it.hortonworks.com/blog/microbenchmarking-storm-1-0-performance/)
and in the first test they reach a spout emission rate
of 3.2 million tuples/s.
I tried to replicate the test with a simple spout that continuously emits the
same string “some data”. Unlike them, I’m using Storm 1.1.1 and the
Storm cluster is set up on my laptop. I’m just testing one spout, not an
entire topology, but if you think more configuration information is
needed, just ask.
To compute the throughput, I query the UI API every 10s for the total number
of tuples processed and subtract the previous measurement to get the number
of tuples in the last 10s. The math gives me something around
32k tuples/s.
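
The polling calculation described above can be sketched in Python as follows. This is only an illustration: it assumes each sample is a (timestamp in seconds, cumulative tuple count) pair already fetched from the UI, and the function name and sample values are hypothetical. It divides by the measured elapsed time instead of a nominal 10s, so a late poll does not inflate the rate.

```python
from typing import List, Tuple


def rates_from_samples(samples: List[Tuple[float, int]]) -> List[float]:
    """Turn (timestamp_seconds, cumulative_count) samples into tuples/s per interval."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Count delta between two polls divided by the real time between them.
        rates.append((c1 - c0) / elapsed)
    return rates


# Hypothetical samples taken 10 s apart.
samples = [(0.0, 0), (10.0, 320_000), (20.0, 640_000)]
print(rates_from_samples(samples))  # [32000.0, 32000.0]
```

This only gives a meaningful rate if the polled value is an all-time cumulative total; subtracting successive readings of a windowed count would not.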
I don’t think I’m wrong in saying that 32k is not even comparable to 3.2
million. Is there something that I’m missing? Is this output normal?
Thank you for your help and for your time,
Alessio
