[ 
https://issues.apache.org/jira/browse/STORM-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619828#comment-14619828
 ] 

ASF GitHub Bot commented on STORM-929:
--------------------------------------

GitHub user errordaiwa opened a pull request:

    https://github.com/apache/storm/pull/625

    [STORM-929] fix issue of high cpu usage when bolt idle

    ### Issue
    [STORM-929](https://issues.apache.org/jira/browse/STORM-929#)
    ### Phenomenon
    1. Run a topology with a large number of executors.
    2. When only a small number of tuples are flowing, CPU usage becomes abnormally high.
    
    ### Reason
    1. Most threads wait at
    
        ```
    com.lmax.disruptor.BlockingWaitStrategy.waitFor(long,
        com.lmax.disruptor.Sequence, com.lmax.disruptor.Sequence[],
        com.lmax.disruptor.SequenceBarrier, long, java.util.concurrent.TimeUnit)
    ```
    2. When few tuples are transferred, most executors block in the
    DisruptorQueue.consumeBatchWhenAvailable method.
    
        ```
    final long availableSequence = _barrier.waitFor(nextSequence, 10, TimeUnit.MILLISECONDS);
    ```
        When the number of bolts is large, the 10ms timeout makes every idle
        thread wake up about 100 times per second. The resulting frequent
        thread context switches lead to abnormally high CPU utilization.
                
    ### Fix
    1. Change the timeout value from 10ms to 1000ms.
        
    ### Test
    1. ENV
        + Storm 0.9.3
        + CPU (E5620, 8 core, 2.40GHz)
        + CentOS 6.5, kernel 3.10.5-3.el6.x86_64
        + Word counter topology, with spout sleep time changed from 100ms to 10s
          and the number of counter bolts changed from 12 to 500.
    2. DisruptorQueue wait time 10ms
        + 28% CPU usage, 13% user, 15% sys
    3. DisruptorQueue wait time 1000ms
        + 5% CPU usage, 3% user, 2% sys
    4. Execution latency does not increase.
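
As a rough illustration of the figures in the pull request description, the idle wake-up rate can be estimated as the number of waiting threads multiplied by the wake-ups per second of each thread. This is a back-of-the-envelope sketch, not code from Storm: the class and method names are hypothetical, and it assumes one waiting consumer thread per executor that wakes only when its timed wait expires.

```java
public class WakeupEstimate {
    // Hypothetical helper (not part of Storm): estimated timed-out wake-ups
    // per second across all idle threads, assuming each thread wakes only
    // when its wait times out.
    public static long wakeupsPerSecond(int waitingThreads, long timeoutMillis) {
        if (timeoutMillis <= 0) {
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        return waitingThreads * 1000L / timeoutMillis;
    }

    public static void main(String[] args) {
        // 500 idle bolt executors, as in the test setup described above.
        System.out.println("10ms timeout:   " + wakeupsPerSecond(500, 10));
        System.out.println("1000ms timeout: " + wakeupsPerSecond(500, 1000));
    }
}
```

Under these assumptions the change reduces idle wake-ups by a factor of 100, which agrees in direction with the reported drop from 28% to 5% CPU usage.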

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/errordaiwa/storm STORM-929

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/625.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #625
    
----
commit 5141440bbd267ca954b19b990b77781b7795bd7b
Author: errordaiwa <[email protected]>
Date:   2015-07-09T02:40:20Z

    fix issue of high cpu usage when bolt idle

----


> High CPU usage when bolt idle due to short disruptor queue wait time
> --------------------------------------------------------------------
>
>                 Key: STORM-929
>                 URL: https://issues.apache.org/jira/browse/STORM-929
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.3
>            Reporter: Xingyu Su
>
> I'm running a topology with a large number of executors (500) on Storm 0.9.3. 
> I find that CPU usage is over 100% when the topology is idle, and half of that 
> CPU usage comes from the kernel. I looked into the CPU utilization of the 
> worker process and found that most threads wait on:
>      com.lmax.disruptor.BlockingWaitStrategy.waitFor(long, 
> com.lmax.disruptor.Sequence, com.lmax.disruptor.Sequence[], 
> com.lmax.disruptor.SequenceBarrier, long, java.util.concurrent.TimeUnit) 
> I used the Storm starter topology (wordcounter) to reproduce this issue. I 
> changed the sleep time of the spout to 10s and the executor count of the bolt 
> to 500, so there was effectively no work to do. Again, CPU usage reached 100%, 
> with half coming from the kernel. I think this may be caused by frequent 
> thread context switching due to the short disruptor queue wait time.
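
The suspected mechanism can be reproduced outside Storm with plain JDK primitives. The sketch below is an assumption-laden stand-in, not the Disruptor or Storm code: each hypothetical consumer thread waits on its own `Condition` that nothing ever signals, using a timed wait, and the program counts how often the threads wake up. The class name, method name, and thread counts are illustrative only.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class IdleWaitDemo {
    // Runs `threads` idle consumers for roughly `runMillis` and returns the
    // total number of wake-ups. Nothing signals the conditions, so wake-ups
    // come from the timed wait expiring (or rare spurious wake-ups).
    public static long countIdleWakeups(int threads, long timeoutMillis, long runMillis) {
        final AtomicLong wakeups = new AtomicLong();
        final long timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(runMillis);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            // Each consumer gets its own lock and condition, like an empty queue.
            final ReentrantLock lock = new ReentrantLock();
            final Condition notEmpty = lock.newCondition();
            workers[i] = new Thread(() -> {
                lock.lock();
                try {
                    long remaining;
                    while ((remaining = deadline - System.nanoTime()) > 0) {
                        // Never wait past the end of the measurement window.
                        notEmpty.awaitNanos(Math.min(timeoutNanos, remaining));
                        wakeups.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    lock.unlock();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return wakeups.get();
    }

    public static void main(String[] args) {
        System.out.println("10ms timeout:   " + countIdleWakeups(50, 10, 1000) + " wake-ups");
        System.out.println("1000ms timeout: " + countIdleWakeups(50, 1000, 1000) + " wake-ups");
    }
}
```

The exact counts depend on the scheduler and machine load, but the short timeout should produce far more wake-ups than the long one. Each wake-up involves the kernel scheduler, which would be consistent with half of the CPU time being attributed to the kernel.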



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
