GitHub user arunmahadevan opened a pull request:
https://github.com/apache/storm/pull/2835
STORM-3222: Fix KafkaSpout internals to use LinkedList instead of ArrayList
KafkaSpout internally maintains a waitingToEmit list per topic partition
and removes the first item to emit on each nextTuple call. The
implementation uses an ArrayList, which results in unnecessary traversal and
copying for each tuple.
Also, I am not sure why nextTuple emits only a single tuple, whereas
ideally it should emit whatever it can in a single nextTuple call, which is
more efficient. However, the logic appears too complicated to refactor.
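The head-removal cost described above can be illustrated with a minimal, standalone sketch. It is not the KafkaSpout code: the class name `WaitingToEmitSketch`, the helper `drainFromHead`, and the use of `Integer` in place of Kafka consumer records are all hypothetical stand-ins. It only shows that `remove(0)` on an `ArrayList` shifts every remaining element, while on a `LinkedList` it just unlinks the head node.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for the per-partition waitingToEmit list;
// Integer replaces the real record type purely for illustration.
public class WaitingToEmitSketch {

    // Removes items one at a time from the head, as the description says
    // happens once per nextTuple call, and returns them in emit order.
    static List<Integer> drainFromHead(List<Integer> waitingToEmit) {
        List<Integer> emitted = new ArrayList<>();
        while (!waitingToEmit.isEmpty()) {
            // ArrayList: shifts all remaining elements left (O(n) per call).
            // LinkedList: unlinks the first node (O(1) per call).
            emitted.add(waitingToEmit.remove(0));
        }
        return emitted;
    }

    // Times a full drain of a list pre-filled with `size` items.
    static long timeDrainNanos(List<Integer> list, int size) {
        for (int i = 0; i < size; i++) {
            list.add(i);
        }
        long start = System.nanoTime();
        drainFromHead(list);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int size = 100_000; // arbitrary size chosen for the comparison
        long arrayNanos = timeDrainNanos(new ArrayList<>(), size);
        long linkedNanos = timeDrainNanos(new LinkedList<>(), size);
        System.out.println("ArrayList drain (ms):  " + arrayNanos / 1_000_000);
        System.out.println("LinkedList drain (ms): " + linkedNanos / 1_000_000);
    }
}
```

Both list types return the items in the same order; only the cost of each head removal differs, so the change should not affect emit ordering.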
https://github.com/apache/storm/pull/2829 for 1.x
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/arunmahadevan/storm STORM-3222-1.x
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/2835.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2835
commit 2e0a01e5e4e36169ef2a45bf5dcd72792231ee07
Author: Arun Mahadevan
Date: 2018-09-12T22:36:24Z
STORM-3222: Fix KafkaSpout internals to use LinkedList instead of ArrayList
---