[
https://issues.apache.org/jira/browse/STORM-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619828#comment-14619828
]
ASF GitHub Bot commented on STORM-929:
--------------------------------------
GitHub user errordaiwa opened a pull request:
https://github.com/apache/storm/pull/625
[STORM-929] fix issue of high cpu usage when bolt idle
### Issue
[STORM-929](https://issues.apache.org/jira/browse/STORM-929#)
### Phenomenon
1. Run a topology with a large number of executors.
2. When only a small number of tuples are flowing, CPU usage becomes abnormally high.
### Reason
1. Most threads wait at
```
com.lmax.disruptor.BlockingWaitStrategy.waitFor(long,
com.lmax.disruptor.Sequence, com.lmax.disruptor.Sequence[],
com.lmax.disruptor.SequenceBarrier, long, java.util.concurrent.TimeUnit)
```
2. When few tuples are transferred, most executors block in the
DisruptorQueue.consumeBatchWhenAvailable method.
```
final long availableSequence = _barrier.waitFor(nextSequence, 10,
TimeUnit.MILLISECONDS);
```
When the number of bolts is large, the 10ms timeout makes idle threads wake up
frequently, and the resulting thread context switches lead to abnormally high CPU utilization.
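As a rough illustration of the scale involved, the sketch below estimates how many timed-out wakeups per second the idle executors generate. It assumes every executor thread is idle and that each expired timeout costs exactly one wakeup; the class and method names are hypothetical and are not part of Storm.

```java
public class IdleWakeups {
    /**
     * Estimated wakeups per second when every thread is idle and each
     * thread wakes once per timeout (a simplifying assumption).
     */
    static long wakeupsPerSecond(int idleThreads, long timeoutMs) {
        return idleThreads * (1000L / timeoutMs);
    }

    public static void main(String[] args) {
        int bolts = 500; // executor count used in the test below
        System.out.println("10ms timeout:   " + wakeupsPerSecond(bolts, 10) + " wakeups/s");
        System.out.println("1000ms timeout: " + wakeupsPerSecond(bolts, 1000) + " wakeups/s");
    }
}
```

Under these assumptions, raising the timeout by a factor of 100 cuts the idle wakeup rate by the same factor.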
### Fix
1. Change the timeout value from 10ms to 1000ms.
### Test
1. ENV
+ Storm 0.9.3
+ CPU (E5620, 8 core, 2.40GHz)
+ CentOS 6.5, kernel 3.10.5-3.el6.x86_64
+ Word count topology, with the spout sleep time changed from 100ms to 10s and
the number of counter bolts changed from 12 to 500.
2. DisruptorQueue wait time 10ms
+ 28% CPU usage, 13% user, 15% sys
3. DisruptorQueue wait time 1000ms
+ 5% CPU usage, 3% user, 2% sys
4. Execution latency does not increase.
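
The effect can also be approximated outside Storm. The following is a minimal sketch, not the Storm or Disruptor code: it assumes an idle consumer behaves like a thread doing a timed wait on a `java.util.concurrent.locks.Condition` that is never signalled, and it counts timed-out wakeups as a proxy for context switches. The class name, method name, and parameters are hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class IdleConsumerDemo {
    /** Runs idle consumer threads for about runMs and returns the total number of timed-out waits. */
    static long countTimeouts(int threads, long timeoutMs, long runMs) {
        ReentrantLock lock = new ReentrantLock();
        Condition notEmpty = lock.newCondition(); // never signalled: the queue stays empty
        AtomicLong timeouts = new AtomicLong();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(runMs);

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                while (System.nanoTime() < deadline) {
                    lock.lock();
                    try {
                        // await(...) returns false when the wait timed out
                        if (!notEmpty.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                            timeouts.incrementAndGet();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return timeouts.get();
    }

    public static void main(String[] args) {
        // Small numbers so the demo finishes quickly; the test above used 500 bolts.
        System.out.println("10ms wait:  " + countTimeouts(8, 10, 500) + " timed-out wakeups");
        System.out.println("100ms wait: " + countTimeouts(8, 100, 500) + " timed-out wakeups");
    }
}
```

The shorter wait should report many more wakeups for the same idle period. This only mirrors the mechanism; the CPU figures above come from the real topology, not from this demo.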
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/errordaiwa/storm STORM-929
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/625.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #625
----
commit 5141440bbd267ca954b19b990b77781b7795bd7b
Author: errordaiwa <[email protected]>
Date: 2015-07-09T02:40:20Z
fix issue of high cpu usage when bolt idle
----
> High CPU usage when bolt idle due to short disruptor queue wait time
> --------------------------------------------------------------------
>
> Key: STORM-929
> URL: https://issues.apache.org/jira/browse/STORM-929
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 0.9.3
> Reporter: Xingyu Su
>
> I'm running a topology with a large number of executors (500) on Storm 0.9.3. I
> find that CPU usage is over 100% when the topology is idle, and half of that CPU
> usage is from the kernel. Looking into the CPU utilization of the worker process,
> I find that most threads wait on:
> com.lmax.disruptor.BlockingWaitStrategy.waitFor(long,
> com.lmax.disruptor.Sequence, com.lmax.disruptor.Sequence[],
> com.lmax.disruptor.SequenceBarrier, long, java.util.concurrent.TimeUnit)
> I used the Storm starter topology (wordcounter) to reproduce this issue. I changed
> the sleep time of the spout to 10s and the executor count of the bolt to 500, so
> there was effectively no work to do. Again CPU usage came to 100%, half of it
> from the kernel. I think this may be caused by frequent thread context switching
> due to the short disruptor queue wait time.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)