> Something is definitely broken in your run or in your measurement method...
The problem doesn't lie in my measurement method; I double-checked it as you
suggested. Thank you for sharing the topology you used, though: with it I was
able to see where I was going wrong. Because I had based my topology on sample
benchmark topologies found online, I had setDebug(true) enabled, and printing
a log message for every emitted tuple was what slowed everything down. With
debug disabled, I now reach a spout emission rate of ~4.5M tuples per second.

Thank you all for the support.
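In case it helps anyone hitting the same wall, the change boils down to one
line in the topology config. Here is a minimal sketch of a spout-only
benchmark topology with debug off; the class names and the hand-rolled
constant spout are illustrative (the real storm-perf topology linked below
uses its own ConstSpout), not the exact code I ran:

import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class SpoutOnlyBenchmark {

    // Minimal constant spout: emits the same string forever, unanchored.
    public static class ConstStringSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;

        @Override
        public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void nextTuple() {
            collector.emit(new Values("some data"));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("str"));
        }
    }

    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new ConstStringSpout(), 1);

        Config conf = new Config();
        conf.setDebug(false);  // the crucial bit: debug=true logs every emit and caps throughput
        conf.setNumWorkers(1);

        StormSubmitter.submitTopology("spout-only-benchmark", conf, builder.createTopology());
    }
}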
----------
Alessio Pagliari
Scale Team, PhD Student
Université Côte d’Azur, CNRS, I3S

> On 31 Mar 2018, at 07:19, Roshan Naik <[email protected]> wrote:
>
> Something is definitely broken in your run or in your measurement method...
> and it's not your hardware that is at fault. The machine on which those
> numbers were run had lots of cores, but the cores were not fast at all. Even
> my mid-2015 MacBook Pro has faster cores than that machine, which had old
> Intel CPUs.
>
> You may be making some mistakes in your calculations. Just run the topo for
> about 14 mins and take the 10-min window reading directly from the UI, then
> calculate the per-second throughput from that (that way you disregard the
> first 3 or 4 mins to allow for warm-up). Also, are you overriding any
> default settings?
>
> Here is the code for the topo that was used:
> https://github.com/apache/storm/blob/1.1.x-branch/examples/storm-perf/src/main/java/org/apache/storm/perf/ConstSpoutOnlyTopo.java
>
> -roshan
>
> On Friday, March 30, 2018, 8:24:39 AM PDT, Alessio Pagliari
> <[email protected]> wrote:
>
> Surely they work on a way more powerful cluster, but the topology is
> composed of just one spout. No parallelization, no bolts, for a total of
> one worker, so one thread in a JVM. Even if I had 100 cores like them, it
> shouldn't make any difference. Please correct me if I'm wrong.
>
> Such a topology will assign its only spout to a worker in a node, so the
> multi-node cluster is pointless. Meanwhile, regarding the number of cores,
> one executor cannot be on multiple cores at the same time, not being a
> multi-threaded process.
>
> Is there some Storm or Java behavior that I'm not aware of?
>
> Thank you,
>
> Alessio
>
> Sent from BlueMail
>
> On Mar 30, 2018, at 4:28 PM, Jacob Johansen <[email protected]> wrote:
>
> For their test, they were using 4 worker nodes (servers), each with 24
> vCores, for a total of 96 vCores. Most laptops max out at 8 vCores and are
> typically at 4-6 vCores.
>
> Jacob Johansen
>
> On Fri, Mar 30, 2018 at 9:18 AM, Alessio Pagliari <[email protected]> wrote:
>
> Hi everybody,
>
> I'm trying to do some preliminary tests with Storm, to understand how far
> it can go. For now I'm focusing on its maximum throughput in terms of
> tuples per second. I saw the benchmark done by the team at Hortonworks
> (ref: https://it.hortonworks.com/blog/microbenchmarking-storm-1-0-performance/)
> where, in the first test, they reach a spout emission rate of 3.2 million
> tuples/s.
>
> I tried to replicate the test: a simple spout that continuously emits the
> same string, "some data". Unlike them, I'm using Storm 1.1.1 and the Storm
> cluster is set up on my laptop. In any case I'm only testing one spout, not
> an entire topology, but if you think more configuration information is
> needed, just ask.
>
> To compute the throughput I request the total number of tuples emitted from
> the UI API every 10 s and subtract the previous reading, giving the number
> of tuples in the last 10 s. The arithmetic gives me something around 32k
> tuples/s.
>
> I don't think I'm wrong in saying that 32k is not even comparable to 3.2
> million. Is there something I'm missing? Is this output normal?
>
> Thank you for your help and for your time,
>
> Alessio
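P.S. For anyone who wants to reproduce the measurement: the 10-second
differencing I described in my first message (quoted above) boils down to
polling the Storm UI REST API and subtracting consecutive totals. A rough
sketch follows; the UI address, the topology id argument, and the regex scrape
of the first "emitted" counter are simplifying assumptions (good enough for a
single-spout topology), not the exact script I used:

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmitRateProbe {

    // First "emitted" counter in the UI's JSON; with a single spout this is
    // the total we want. A real client would use a JSON parser instead.
    private static final Pattern EMITTED = Pattern.compile("\"emitted\"\\s*:\\s*(\\d+)");

    static long fetchEmitted(String url) throws IOException {
        try (InputStream in = new URL(url).openStream();
             Scanner s = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            Matcher m = EMITTED.matcher(s.hasNext() ? s.next() : "");
            if (!m.find()) {
                throw new IOException("no emitted counter in response");
            }
            return Long.parseLong(m.group(1));
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: topology id as shown by the UI,
        // e.g. "spout-only-benchmark-1-1522400000"
        String url = "http://localhost:8080/api/v1/topology/" + args[0];
        long prev = fetchEmitted(url);
        while (true) {
            Thread.sleep(10_000);              // sample every 10 s
            long cur = fetchEmitted(url);
            System.out.printf("~%d tuples/s%n", (cur - prev) / 10);
            prev = cur;
        }
    }
}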
