Roshan Naik created STORM-1526:
----------------------------------

             Summary: Improve Storm core performance
                 Key: STORM-1526
                 URL: https://issues.apache.org/jira/browse/STORM-1526
             Project: Apache Storm
          Issue Type: Bug
            Reporter: Roshan Naik

Profiling a Speed of Light topology running on Storm core without ACKers shows:

- Call tree info: a big part of the nextTuple() invocation is consumed in the SpoutOutputCollector.emit() call. 20% of it goes to reflection in the Clojure code.
- Method stats view: a lot of time is spent blocking on the disruptor queue.

The performance issue is narrowed down to this Clojure code in executor.clj:

{code}
(defn mk-custom-grouper [^CustomStreamGrouping grouping ^WorkerTopologyContext context ^String component-id ^String stream-id target-tasks]
  (.prepare grouping context (GlobalStreamId. component-id stream-id) target-tasks)
  (if (instance? LoadAwareCustomStreamGrouping grouping)
    (fn [task-id ^List values load]
      (.chooseTasks grouping task-id values load)) ; <-- problematic invocation
    (fn [task-id ^List values load]
      (.chooseTasks grouping task-id values))))
{code}

*grouping* is statically typed to the base type CustomStreamGrouping. In this run, its actual type is the derived type LoadAwareCustomStreamGrouping. The base type does not have a chooseTasks() method with 3 args; only the derived type has that method. Consequently, Clojure falls back to reflection at run time: it iterates over the methods of the *grouping* object to locate the right one and then invokes it.

This lookup sits on the critical path of SpoutOutputCollector.emit(), where roughly 20% of the time is spent just locating the right method.
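
The reflection fallback described above can be reproduced without Storm on the classpath. The following is only a minimal sketch using JDK types as stand-ins: java.util.Collection plays the role of CustomStreamGrouping, java.util.List plays the role of LoadAwareCustomStreamGrouping, and get() plays the role of the 3-arg chooseTasks(). The function names are hypothetical. Re-hinting with the derived type, as shown in the second function, is one possible remedy and not necessarily the change that will be applied to executor.clj.

{code}
(import '(java.util ArrayList Collection List))

;; Ask the compiler to report call sites it can only resolve by reflection.
(set! *warn-on-reflection* true)

;; Analogue of the problematic branch: the hint names the base type, which
;; has no get() method, so the method is looked up reflectively on each call.
;; Compiling this function should trigger a reflection warning.
(defn mk-reflective-getter [^Collection coll]
  (fn [i] (.get coll i)))

;; Analogue of one possible remedy: once the instance? check has passed,
;; re-hint with the derived type so the compiler can emit a direct call.
(defn mk-direct-getter [^Collection coll]
  (if (instance? List coll)
    (let [^List l coll]
      (fn [i] (.get l (int i))))
    (fn [_] nil)))

;; Usage: both closures return the same element ("b"); only the way the
;; method is resolved differs.
(def items (doto (ArrayList.) (.add "a") (.add "b") (.add "c")))
((mk-reflective-getter items) 1)
((mk-direct-getter items) 1)
{code}

Applied to mk-custom-grouper, the same idea would amount to hinting *grouping* as ^LoadAwareCustomStreamGrouping inside the branch guarded by the instance? check, so that the 3-arg chooseTasks() call can be resolved at compile time.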