[ 
https://issues.apache.org/jira/browse/STORM-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956647#comment-13956647
 ] 

ASF GitHub Bot commented on STORM-12:
-------------------------------------

GitHub user revans2 opened a pull request:

    https://github.com/apache/incubator-storm/pull/57

    STORM-12 Reduce thread usage of Netty transport.

    Makes the netty clients share a thread pool, and never block any of the
    threads in the pool.  All existing tests pass, so no new tests were added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/revans2/incubator-storm netty-thread-usage

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-storm/pull/57.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #57
    
----
commit 94c4d4d9e6c4ce736141668c585818214a9d26cf
Author: Robert (Bobby) Evans <[email protected]>
Date:   2014-04-01T15:32:31Z

    STORM-12 Reduce thread usage of Netty transport.

----


> Reduce Thread Usage of Netty Transport
> --------------------------------------
>
>                 Key: STORM-12
>                 URL: https://issues.apache.org/jira/browse/STORM-12
>             Project: Apache Storm (Incubating)
>          Issue Type: Improvement
>            Reporter: Robert Joseph Evans
>
> When users create large topologies, the Storm Netty messaging layer uses a
> large number of threads.  This has resulted in OOMs because the default
> ulimit on most Linux distros is around 4000 processes.  The messaging layer
> appears to want one thread per server it is connected to, which means one
> thread for every other worker in the system.
> In one particular case we saw:
>       1 (Curator delay thread)
>       1 (Curator Event Processor)
>       1 (Finalizer)
>       1 (GC???)
>       1 (Storm messaging recv thread asking netty for messages)
>       1 (Thread pool polling on a Synchronous queue???)
>       1 (ZK Connection)
>       1 (ZK epoll)
>       2 (???)
>       2 (Netty epoll)
>       6 (Timer Thread)
>      15 (Disruptor consume batches)
>     104 (Netty Thread pool taking messages to be sent)
> and this process was dying with OOMs because it could not create any more
> Netty threads.
> Looking at the code, the threads appear to come from two different things.
> First, the Client code uses its own thread pool for each Client instead of
> sharing a thread pool; second, the protocol itself blocks the thread in
> takeMessages() if there are no messages to send.
> So we need to share one thread pool between all of the clients and modify
> the protocol so that takeMessages() does not block.  Once it no longer
> blocks, we also need a way to have Client.send write directly to the Channel
> in some situations so that the messages are still sent.
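
The two changes described above can be illustrated with a minimal Java sketch. It is not the code from the pull request: SketchClient, its "wire" list (standing in for a Netty Channel), and the run() helper are all hypothetical names invented for illustration. Instead of writing to the Channel directly from send, the sketch schedules a short flush task on the shared pool, which is a simplification of what the issue proposes. The point it shows is that many clients can share a small fixed pool when takeMessages() returns immediately and the sender, not a parked thread, triggers the write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class SharedPoolSketch {

    /** Hypothetical stand-in for a per-connection client; not Storm's Client class. */
    static final class SketchClient {
        private final Queue<String> pending = new ConcurrentLinkedQueue<>();
        private final AtomicBoolean flushScheduled = new AtomicBoolean(false);
        private final ExecutorService sharedPool; // one pool for ALL clients
        private final List<String> wire;          // assumption: stands in for the Channel

        SketchClient(ExecutorService sharedPool, List<String> wire) {
            this.sharedPool = sharedPool;
            this.wire = wire;
        }

        /** Non-blocking: returns whatever is queued right now, possibly nothing. */
        List<String> takeMessages() {
            List<String> batch = new ArrayList<>();
            String message;
            while ((message = pending.poll()) != null) {
                batch.add(message);
            }
            return batch;
        }

        /** No thread waits in takeMessages(), so the sender must trigger the write. */
        void send(String message) {
            pending.add(message);
            // Schedule at most one outstanding flush per client.
            if (flushScheduled.compareAndSet(false, true)) {
                sharedPool.execute(this::flush);
            }
        }

        /** Short task on the shared pool; synchronized to keep per-client ordering. */
        private synchronized void flush() {
            // Reset the flag before draining so a concurrent send is never lost.
            flushScheduled.set(false);
            wire.addAll(takeMessages());
        }
    }

    /** Sends messages through many clients on one small pool; returns the delivered count. */
    static int run(int clientCount, int messagesPerClient, int poolThreads) {
        ExecutorService sharedPool = Executors.newFixedThreadPool(poolThreads);
        List<List<String>> wires = new ArrayList<>();
        List<SketchClient> clients = new ArrayList<>();
        for (int i = 0; i < clientCount; i++) {
            List<String> wire = new ArrayList<>();
            wires.add(wire);
            clients.add(new SketchClient(sharedPool, wire));
        }
        for (SketchClient client : clients) {
            for (int m = 0; m < messagesPerClient; m++) {
                client.send("msg-" + m);
            }
        }
        sharedPool.shutdown(); // already-submitted flush tasks still run
        try {
            if (!sharedPool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int delivered = 0;
        for (List<String> wire : wires) {
            delivered += wire.size();
        }
        return delivered;
    }

    public static void main(String[] args) {
        // 104 clients (the count from the thread dump above) on only 2 pool threads.
        System.out.println("delivered: " + run(104, 3, 2));
    }
}
```

In this sketch the number of threads is set by the pool size rather than by the number of clients, which is the property the issue asks for. A real implementation would write to the Netty Channel and handle write failures and reconnects, none of which is modeled here.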



