[ https://issues.apache.org/jira/browse/STORM-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996578#comment-14996578 ]
Robert Joseph Evans commented on STORM-1190:
--------------------------------------------
This is most likely due to the disruptor queue batching.
https://github.com/apache/storm/pull/765
The experiments showed that the CPU utilization under light load increased
significantly, but the throughput at higher loads doubled.
https://github.com/apache/storm/pull/765#issuecomment-149987537
You can try to mitigate this by setting topology.disruptor.batch.size to 1 and
topology.disruptor.batch.timeout.millis to something large, like 1000.
If this works for you, I will add some special-case code for a batch size of 1.
That should drop the CPU utilization back to where it was before, but you will
also lose the increased throughput.
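
For illustration, the two overrides suggested above could be collected in a
plain map before being handed to a topology. This is only a sketch: the two
key names and values are taken from the comment, while the class name, the
method name, and the idea of building the map separately are hypothetical and
not part of Storm itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm: gathers the suggested overrides.
public class DisruptorBatchOverrides {
    // Config keys quoted from the comment above.
    static final String BATCH_SIZE = "topology.disruptor.batch.size";
    static final String BATCH_TIMEOUT_MILLIS = "topology.disruptor.batch.timeout.millis";

    // Returns the suggested mitigation: batch size 1, with a large timeout.
    static Map<String, Object> lowLoadOverrides() {
        Map<String, Object> conf = new HashMap<>();
        conf.put(BATCH_SIZE, 1);
        conf.put(BATCH_TIMEOUT_MILLIS, 1000); // "something large like 1000"
        return conf;
    }

    public static void main(String[] args) {
        // Print the overrides so they can be checked before submitting.
        System.out.println(lowLoadOverrides());
    }
}
```

Storm's Config class is itself a HashMap<String, Object>, so entries like
these could presumably be copied into the topology configuration with putAll
before the topology is submitted. Whether that is the most convenient place
to set them depends on how the topologies are deployed.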
> System load spikes in recent snapshot
> -------------------------------------
>
> Key: STORM-1190
> URL: https://issues.apache.org/jira/browse/STORM-1190
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Affects Versions: 0.11.0
> Environment: 10x (CoreOS stable (766.4.0) / k8s 1.0.1 / docker
> running on Azure VMs)
> Reporter: Michael Schonfeld
> Priority: Critical
> Attachments: Screenshot 2015-11-08 22.17.57.png, Screenshot
> 2015-11-08 22.18.06.png
>
>
> We've been running Storm's snapshots on our production cluster for a little
> while now (that back pressure support really helped us), and we've noticed a
> sudden spike in system load when going from
> commit@ba1250993d10ffc523c9f5464371fbeb406d216f to the current latest
> commit@c12e28c829fcfabc0a3a775fb9714968b7e3e349. Both versions were running
> the exact same topologies, and there was no significant change in workload.
> Not exactly sure how to even begin to debug this, so we ended up just rolling
> back. Thoughts?
> Stats screenshots attached