Hello, I seem to be getting decreasing performance for each additional bolt I add to a topology. I would like to know what performance decrease to expect when making topologies longer. Here is my current topology:

Spout --shuffleGrouping--> Bolt1 --fieldsGrouping--> Bolt2

Setup:
- 1 x Spout, 1 x Bolt1, 1 x Bolt2. Each component has 1 task and 1 executor.
- 3 workers, 3 ackers.
- Nimbus is set up with 3 AWS supervisors, which are c4.xlarge instances.
- Each AWS instance has only one worker port available (6000, I think), meaning one worker/component/executor per AWS instance.
- The topology uses guaranteed message processing (sending an id with the spout tuple and anchoring the tuple in the bolts).
- I think the average tuple size is <= 450 bytes.
- Max pending (topology.max.spout.pending) is set to 35. During testing we found that 3 ackers and 35 max pending was "good".

Results:
- Spout only (no bolts): emitting about 70,000 tuples per second.
- After adding the first bolt: dropped to about 10,000 - 14,000 per second.
- After adding the final bolt: dropped to about 5,000 per second.

Looking at the Storm UI, the spout's complete latency is about 11ms when running the full topology, and each bolt takes about 0.2ms to execute. When running only the spout, the complete latency is a lot less, maybe 2-3ms.

I understand that guaranteed message processing costs some performance, but is it supposed to drop this much? Is this a networking issue? My AWS instances are rated "HIGH" for networking performance and they are in a placement group.

Thanks for any feedback. Sorry if this is the wrong mailing list. I will add any information you need in a reply.
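
P.S. One back-of-envelope check on the numbers above, in case it helps frame the question. Since max pending caps how many un-acked tuples a spout task can have in flight, Little's law suggests a rough steady-state ceiling of throughput ≈ max pending / complete latency. The sketch below just does that arithmetic. The class and method names (PendingBound, maxTuplesPerSecond, pendingNeeded) are made up for illustration and are not part of Storm, and the inputs are the approximate figures from the Storm UI, which are sampled averages, so treat the result as an estimate rather than a hard limit.

```java
/**
 * Rough Little's-law estimate for a single spout task with acking enabled.
 * Hypothetical helper for illustration only; nothing here is a Storm API.
 */
public class PendingBound {

    /** Approximate ceiling on acked tuples/s: in-flight tuples divided by latency. */
    public static double maxTuplesPerSecond(int maxPending, double completeLatencyMs) {
        if (maxPending <= 0 || completeLatencyMs <= 0) {
            throw new IllegalArgumentException("inputs must be positive");
        }
        return maxPending * 1000.0 / completeLatencyMs;
    }

    /** In-flight tuples needed to sustain a target rate at a given complete latency. */
    public static int pendingNeeded(double targetTuplesPerSecond, double completeLatencyMs) {
        if (targetTuplesPerSecond <= 0 || completeLatencyMs <= 0) {
            throw new IllegalArgumentException("inputs must be positive");
        }
        return (int) Math.ceil(targetTuplesPerSecond * completeLatencyMs / 1000.0);
    }

    public static void main(String[] args) {
        // Figures taken from the post: max pending 35, ~11ms full topology, ~2-3ms spout only.
        System.out.printf("full topology (11ms):  ~%.0f tuples/s%n", maxTuplesPerSecond(35, 11.0));
        System.out.printf("spout only (2.5ms):    ~%.0f tuples/s%n", maxTuplesPerSecond(35, 2.5));
        System.out.printf("pending for 70k/s at 11ms: %d%n", pendingNeeded(70_000, 11.0));
    }
}
```

If this reasoning holds, the drop I am seeing may be at least partly explained by the small max pending value combined with the higher complete latency, rather than by the bolts themselves, but I would appreciate confirmation from someone who knows the internals.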
