[
https://issues.apache.org/jira/browse/HADOOP-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765929#comment-13765929
]
Daryn Sharp commented on HADOOP-9956:
-------------------------------------
I've considered a number of ways to avoid running out of fds. Rather than
throttle accepts or reject connections, I think the best strategy is to let
the engine red-line without letting it blow up. Once a limit is hit (perhaps
the getrlimit value minus some reserve), the listener will wait for sockets
to close or idle out before accepting more.
I'd like to suggest another subtask to handle running out of fds since it's a
pre-existing issue.
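To illustrate the fd-limit idea in the comment above, here is a minimal
sketch, not the proposed patch. The class AcceptGate and its method names are
hypothetical, and fdLimit and reserve stand in for "getrlimit minus some
number"; how the limit is actually obtained is left open. The listener would
call awaitSlot() before each accept, and whatever closes or idles out a
connection would call connectionClosed().

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of Hadoop's ipc.Server.
final class AcceptGate {
    // One permit per connection the listener is still allowed to accept.
    private final Semaphore slots;

    // fdLimit: assumed process descriptor limit; reserve: assumed headroom
    // kept free for files, logs and other non-connection descriptors.
    AcceptGate(int fdLimit, int reserve) {
        if (reserve < 0 || fdLimit <= reserve) {
            throw new IllegalArgumentException("fdLimit must exceed reserve");
        }
        this.slots = new Semaphore(fdLimit - reserve);
    }

    // Called by the listener before accept(). Returns true once a slot is
    // taken, false if none freed up within the timeout or the thread was
    // interrupted while waiting.
    boolean awaitSlot(long timeoutMillis) {
        try {
            return slots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Called when a connection closes or is idled out.
    void connectionClosed() {
        slots.release();
    }

    // Number of connections that can still be accepted right now.
    int freeSlots() {
        return slots.availablePermits();
    }
}
```

With this shape nothing is rejected by the server itself: while the listener
waits for a slot, new connections simply are not accepted yet and remain
pending in the listen backlog.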
> RPC listener inefficiently assigns connections to readers
> ---------------------------------------------------------
>
> Key: HADOOP-9956
> URL: https://issues.apache.org/jira/browse/HADOOP-9956
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: ipc
> Affects Versions: 2.0.0-alpha, 3.0.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Attachments: HADOOP-9956.patch
>
>
> The socket listener and readers use complex synchronization to update the
> reader's NIO {{Selector}}. Updating active selectors is not thread-safe so
> precautions are required.
> However, the current locking choreography results in a serialized
> distribution of new connections to the parallel socket readers. A
> slower/busier reader can stall the listener and throttle performance.
> The problem manifests as unexpectedly low CPU utilization by the listener and
> readers (~20-30%) under heavy load. The call queue is shallow when it should
> be overflowing.
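
One common way to avoid the serialized hand-off described in the quoted
summary is to have the listener enqueue each new connection and wake the
reader, so that only the reader thread ever registers channels on its own
Selector. The sketch below assumes that approach; QueueingReader and its
methods are hypothetical names, and it is not claimed to match the attached
patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical reader for illustration; not Hadoop's ipc.Server reader.
final class QueueingReader implements Runnable {
    private final Selector selector;
    private final BlockingQueue<SocketChannel> pending = new LinkedBlockingQueue<>();
    private volatile boolean running = true;

    QueueingReader() {
        try {
            selector = Selector.open();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Listener thread: enqueue and wake the reader. The listener never
    // registers on the selector, so it does not wait on a busy reader.
    void addConnection(SocketChannel channel) {
        pending.add(channel);
        selector.wakeup();
    }

    // Connections handed over but not yet registered by the reader.
    int pendingCount() {
        return pending.size();
    }

    // Reader thread only: register queued channels on this reader's selector.
    private void registerPending() throws IOException {
        SocketChannel channel;
        while ((channel = pending.poll()) != null) {
            channel.configureBlocking(false);
            channel.register(selector, SelectionKey.OP_READ);
        }
    }

    @Override
    public void run() {
        try {
            while (running) {
                registerPending();
                selector.select();
                // A real reader would read from each ready key here.
                selector.selectedKeys().clear();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            try {
                selector.close();
            } catch (IOException ignored) {
                // Nothing useful to do while shutting down.
            }
        }
    }

    void shutdown() {
        running = false;
        selector.wakeup();
    }
}
```

Because wakeup() makes the next select() return immediately if the reader is
not currently blocked, a connection queued between registerPending() and
select() is still picked up on the following loop iteration.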