[
https://issues.apache.org/jira/browse/STORM-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Joseph Evans reassigned STORM-12:
----------------------------------------
Assignee: Robert Joseph Evans
> Reduce Thread Usage of Netty Transport
> --------------------------------------
>
> Key: STORM-12
> URL: https://issues.apache.org/jira/browse/STORM-12
> Project: Apache Storm (Incubating)
> Issue Type: Improvement
> Reporter: Robert Joseph Evans
> Assignee: Robert Joseph Evans
> Fix For: 0.9.2-incubating
>
>
> When users start to create large topologies, the Storm Netty messaging layer
> uses a lot of threads. This has resulted in OOMs because the default ulimit
> on most Linux distros is around 4000 processes. It looks like the messaging
> layer wants to have one thread per server it is connected to, which means
> one thread for every other worker in the system.
> In one particular case we saw the following threads:
> 1 (Curator delay thread)
> 1 (Curator Event Processor)
> 1 (Finalizer)
> 1 (GC???)
> 1 (Storm messaging recv thread asking netty for messages)
> 1 (Thread pool polling on a Synchronous queue???)
> 1 (ZK Connection)
> 1 (ZK epoll)
> 2 (???)
> 2 (Netty epoll)
> 6 (Timer Thread)
> 15 (Disruptor consume batches)
> 104 (Netty Thread pool taking messages to be sent)
> and this process was dying with OOMs because it could not create any more
> Netty threads.
> Looking at the code, it appears that this comes from two different things.
> First, the Client code is using its own thread pool for each Client instead
> of sharing a thread pool; second, the protocol itself blocks the thread in
> takeMessages() if there are no messages to send.
> So we need to make the thread pool shared between all of the clients and
> modify the protocol so that takeMessages() does not block. But once it no
> longer blocks, we also need a way to have Client.send write directly to the
> Channel in some situations so that the messages are still sent.
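
The proposed change can be illustrated with a minimal sketch that uses only java.util.concurrent. It is not Storm's actual code. The `Client` class here, its `channel` list (a stand-in for a Netty Channel), and the `writing` flag are hypothetical names chosen for illustration. Every client shares one small pool, `takeMessage()` returns null instead of blocking, and `send()` starts the write itself when no write is in flight. In this sketch `send()` hands the drain to the shared pool; writing directly on the caller's thread, as the description suggests, would mean calling `drain()` in place of `sharedPool.execute(...)`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class SharedPoolSketch {

    /** Hypothetical stand-in for the messaging Client; not Storm's actual class. */
    static class Client {
        private final ExecutorService sharedPool;
        private final Queue<String> pending = new ConcurrentLinkedQueue<>();
        // True while some thread is draining this client's queue.
        private final AtomicBoolean writing = new AtomicBoolean(false);
        // Stand-in for a Netty Channel: records what was "written".
        final List<String> channel = Collections.synchronizedList(new ArrayList<>());

        Client(ExecutorService sharedPool) {
            this.sharedPool = sharedPool;
        }

        /** Non-blocking: returns null when idle instead of parking a thread. */
        String takeMessage() {
            return pending.poll();
        }

        void send(String msg) {
            pending.add(msg);
            // No thread is parked waiting on this client, so send() has to
            // start the write itself if none is already in flight.
            if (writing.compareAndSet(false, true)) {
                sharedPool.execute(this::drain);
            }
        }

        private void drain() {
            do {
                String m;
                while ((m = takeMessage()) != null) {
                    channel.add(m);
                }
                writing.set(false);
                // Re-check: a send() may have enqueued after the last poll
                // but before the flag was cleared.
            } while (!pending.isEmpty() && writing.compareAndSet(false, true));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Two pool threads serve all clients, rather than one thread each.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Client> clients = new ArrayList<>();
        for (int i = 0; i < 104; i++) {
            clients.add(new Client(pool));
        }
        for (Client c : clients) {
            c.send("a");
            c.send("b");
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        int delivered = 0;
        for (Client c : clients) {
            delivered += c.channel.size();
        }
        System.out.println("delivered " + delivered + " messages");
    }
}
```

The 104 clients in `main` mirror the 104 Netty threads in the listing above. The pool size of 2 is an arbitrary choice for the sketch, not a recommended setting.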
--
This message was sent by Atlassian JIRA
(v6.2#6252)