GitHub user srdo opened a pull request: https://github.com/apache/storm/pull/2676
STORM-3073: Uncap pendingEmits for bolt executors, and prevent LoadSpout from overflowing pendingEmits in spout executors

https://issues.apache.org/jira/browse/STORM-3073

The first commit contains the changes I made to ExclamationTopology to provoke the error. It is included only to make the error easier to understand, and I'll remove it after review.

There are two changes in this PR.

The first is to uncap the pendingEmits queue for bolt executors. It's currently capped at 1024 elements, which makes it dangerous for a bolt to emit more than 1024 tuples in a single execute invocation. If the bolt executor is experiencing backpressure and tries to add those tuples to pendingEmits, the queue limit will be exceeded and the worker will crash.

The second change is to make LoadSpout re-emit failed tuples from nextTuple instead of from fail. The spout executor is also limited to 1024 tuples in the pending queue, so it is likely to exceed the queue limit and crash if many tuples fail at the same time (e.g. due to timeout) while the spout is adding tuples to pendingEmits. Since the spout executor won't call nextTuple while there are tuples in pendingEmits, moving the retries into nextTuple prevents the spout from exceeding the pendingEmits limit.
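
As a rough illustration of the second change, here is a minimal sketch in plain Java. It does not use Storm's classes and is not the code in this PR: BoundedPendingEmits and RetryingSpoutSketch are hypothetical stand-ins, tuples are reduced to long ids, and the caller is assumed to invoke nextTuple only when the pending queue is empty, as described above.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical stand-in for a capped pendingEmits queue (not Storm's class). */
class BoundedPendingEmits {
    private final int capacity;
    private final Queue<Long> queue = new ArrayDeque<>();

    BoundedPendingEmits(int capacity) {
        this.capacity = capacity;
    }

    /** Adds a tuple id, failing hard if the cap would be exceeded. */
    void add(long tupleId) {
        if (queue.size() >= capacity) {
            throw new IllegalStateException("pendingEmits limit exceeded");
        }
        queue.add(tupleId);
    }

    boolean isEmpty() {
        return queue.isEmpty();
    }

    int size() {
        return queue.size();
    }

    /** Simulates the executor flushing pending tuples downstream. */
    void drain() {
        queue.clear();
    }
}

/** Hypothetical spout that replays failed tuples from nextTuple, not from fail. */
class RetryingSpoutSketch {
    private final BoundedPendingEmits pendingEmits;
    private final Queue<Long> replays = new ArrayDeque<>();
    private long nextId = 0;

    RetryingSpoutSketch(BoundedPendingEmits pendingEmits) {
        this.pendingEmits = pendingEmits;
    }

    /** Only records the failure; emits nothing, so a burst of failures cannot fill the queue. */
    void fail(long tupleId) {
        replays.add(tupleId);
    }

    /** Emits at most one tuple per call, preferring a queued replay over a new tuple. */
    void nextTuple() {
        Long replay = replays.poll();
        if (replay != null) {
            pendingEmits.add(replay);
        } else {
            pendingEmits.add(nextId++);
        }
    }

    int pendingReplays() {
        return replays.size();
    }
}
```

In this sketch, 5000 calls to fail leave the pending queue untouched, and each later nextTuple call adds a single tuple. Emitting directly from fail would instead push every failed tuple into the capped queue at once.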

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srdo/storm STORM-3073

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/2676.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2676

----
commit 538adbbe2d6129ab36a596d612870cef80f3af5a
Author: Stig Rohde Døssing <srdo@...>
Date:   2018-05-15T13:09:54Z

    WIP test

commit 9d5adc57bd3aa5dcdf4945196d7b7843fbddf2d1
Author: Stig Rohde Døssing <srdo@...>
Date:   2018-05-15T13:13:09Z

    STORM-3073: Uncap pendingEmits for bolt executors, and prevent LoadSpout from overflowing pendingEmits in spout executors

----

---