GitHub user HeartSaVioR commented on the issue:

https://github.com/apache/storm/pull/2261

I went back to 8f63d5a, which doesn't touch any interfaces, and ran the same tests:

Grouping | transferred (messages) | transfer rate (message/s) | spout_transferred | spout_acks | spout_throughput (acks/s)
-- | -- | -- | -- | -- | --
New LocalOrShuffle (patch) | 160441600 | 2674026 | 160441600 | 160441580 | 2674026

Now it is a bit slower than ShuffleGrouping but still faster than LoadAwareShuffleGrouping (by about 22%). So we can choose either the larger improvement, which touches multiple parts, or a still substantial improvement that leaves the other parts untouched.

I also tested another thing: replacing the List with an array in ShuffleGrouping. The test result is below:

Grouping | transferred (messages) | transfer rate (message/s) | spout_transferred | spout_acks | spout_throughput (acks/s)
-- | -- | -- | -- | -- | --
LocalOrShuffle with loadaware disabled (master) | 161437800 | 2690630 | 161437800 | 161437760 | 2690630

It doesn't seem to bring a noticeable improvement. The difference may come from the length of the array: in the test the array is too small (it would have 1 element), so another `set()` has to be called in addition to `incrementAndGet()` every time. Please note that the length of the array in the patch is 1000, so `set()` is called once every 1000 calls. We could grow the array in `prepare()` to get better performance, but that would be a micro-optimization, and I'm not sure we need to apply it.
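To make the `set()` versus `incrementAndGet()` point concrete, here is a minimal single-threaded sketch of a wrapping index over an array. It is not the code from the patch or from ShuffleGrouping: the class name `WrappingChooser`, the `resets` counter, and the method names are hypothetical and exist only to count how often the extra `set()` happens for a given array length.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the actual Storm implementation.
final class WrappingChooser {
    private final int[] choices;
    private final AtomicInteger current = new AtomicInteger(-1);
    // Counts how many times set() was needed. Plain field, so this sketch
    // is only meant to be driven from a single thread.
    private long resets = 0;

    WrappingChooser(int length) {
        choices = new int[length];
        for (int i = 0; i < length; i++) {
            choices[i] = i; // stand-in for target task ids
        }
    }

    int next() {
        int idx = current.incrementAndGet();
        if (idx >= choices.length) {
            // Ran off the end of the array: wrap around with an extra set().
            current.set(0);
            resets++;
            idx = 0;
        }
        return choices[idx];
    }

    long resets() {
        return resets;
    }
}
```

In this sketch, driving a 1-element chooser for 10000 calls triggers the extra `set()` on every call after the first (9999 times), while a 1000-element chooser triggers it only 9 times over the same number of calls. That matches the reasoning above about why a very short array would not show a benefit.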