Attached are screenshots of the performance profile for Storm core using a Speed of Light (SOL) topology.
Topology info:
- 1 bolt instance, 1 spout instance, 1 worker
- ACKer count = 0
- Spout precomputes a random list of tuples, then keeps emitting them endlessly
- Bolt just re-emits the same tuple and acks
- localOrShuffleGrouping
- Topology code: https://github.com/roshannaik/storm-benchmark-sol/blob/master/src/main/java/storm/benchmark/benchmarks/SOL.java

Observations:
* The Call tree info shows that a big part of the nextTuple() invocation is consumed in the Collector.emit() call. A major part of that goes to reflection in the Clojure code.
* The Method Stats view shows that a lot of time is spent blocking on the disruptor queue.
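
For anyone reading without opening the linked code, the spout behaviour described under Topology info (precompute a random list once, then cycle through it forever) can be sketched roughly as below. This is only an illustration in plain Java with no Storm dependency: the class name PrecomputedTupleSource, its constructor parameters, and the use of random lowercase strings as tuple payloads are all assumptions, not taken from SOL.java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the spout's data source; NOT part of the Storm API
// and not copied from SOL.java.
final class PrecomputedTupleSource {
    private final List<String> tuples;
    private int index = 0;

    // Build the whole list up front so that no random generation happens
    // on the hot path that a nextTuple() implementation would call.
    PrecomputedTupleSource(int count, int length, long seed) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        Random random = new Random(seed);
        tuples = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            StringBuilder sb = new StringBuilder(length);
            for (int j = 0; j < length; j++) {
                sb.append((char) ('a' + random.nextInt(26)));
            }
            tuples.add(sb.toString());
        }
    }

    // Return the next precomputed payload, wrapping around at the end of the list.
    String next() {
        String tuple = tuples.get(index);
        index = (index + 1) % tuples.size();
        return tuple;
    }

    public static void main(String[] args) {
        PrecomputedTupleSource source = new PrecomputedTupleSource(3, 8, 42L);
        // Prints 6 payloads; the 4th to 6th repeat the 1st to 3rd.
        for (int i = 0; i < 6; i++) {
            System.out.println(source.next());
        }
    }
}
```

With this pattern the per-call cost of producing a tuple is a list lookup and an index increment, so time seen under nextTuple() in the profile is mostly attributable to the emit path rather than to data generation.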