Github user roshannaik commented on a diff in the pull request: https://github.com/apache/storm/pull/2502#discussion_r167385590 --- Diff: storm-client/src/jvm/org/apache/storm/executor/spout/SpoutOutputCollectorImpl.java --- @@ -44,14 +48,15 @@ private final Boolean isEventLoggers; private final Boolean isDebug; private final RotatingMap<Long, TupleInfo> pending; + private TupleInfo globalTupleInfo = new TupleInfo(); // thread safety: assumes Collector.emit*() calls are externally synchronized (if needed). --- End diff -- Accessing the Thread Local (TL) instance via ThreadLocal.get() typically involves a map lookup behind the scenes. **Related Note:** I reluctantly used TL for latching on to JCQueue.BatchInserter instance for the producers to JCQueue. Reluctant since I noticed perf hit when doing some targeted microbenchmarking. I used it anyway because it was a perf improvement over the ConcurrentHashMap employed in Disruptor, and eliminating TL needed a bigger change to the interface and producers. I think it is possible to achieve TL free JCQueue and gain some perf.. perhaps in a follow up jira. Although many decisions were measured, due to scope it was not feasible to measure each one. So, in the critical path, I have taken this general approach of : - avoid locks & synchronization ... and try using lock/wait-free approaches where synchronization is unavoidable. - avoid map lookups and object creation This was a case of avoiding synchronization, (TL) map lookups & object allocation.
---