[
https://issues.apache.org/jira/browse/HADOOP-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614888#action_12614888
]
Raghu Angadi commented on HADOOP-3779:
--------------------------------------
Yes, in fact you may not like the 256 limitation at all.
In any case, if you just want to close any client connection that is idle (for
say 1 sec), that needs to be handled at the DataNode level and not in
SelectorPool. SelectorPool is an implementation detail of a utility that does
blocking IO with NIO sockets. From your brief description, the suggested fix
does not seem very useful and is at the wrong level (kind of like
writing a kernel module to close an idle socket :) ). Maybe a detailed
description, or better, a simple prototype implementation, would make it clearer.
Note that we need to rewrite the data transfer code paths in DataNode to do real
async transfers (network transfers are easy, but the DataNode also needs to do
disk I/O). I would say that sooner or later DataNode needs to do that; it cannot
continue to live with one thread per connection.
I am thinking of proposing a design for "async data transfers" if there is
enough interest. The basic idea is to share a pool of threads (we need a pool to
do disk I/O) to handle all the clients' transfers, something like 5 or so per
disk. This requires a substantial rewrite of the readBlock() and writeBlock()
code paths in DataNode.
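To make the idea a bit more concrete, here is a minimal sketch of a shared
per-disk thread pool. It is not DataNode code: the class DiskIoPools, its
methods, and the disk names are all hypothetical, and the real readBlock() and
writeBlock() paths would need far more than this. It only shows that the number
of I/O threads is bounded by (disks x threads per disk) rather than by the
number of client connections.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Hypothetical sketch: one small fixed-size pool per disk, shared by all transfers. */
class DiskIoPools {
    private final Map<String, ExecutorService> poolsByDisk = new LinkedHashMap<>();
    private final int threadsPerDisk;

    DiskIoPools(List<String> disks, int threadsPerDisk) {
        this.threadsPerDisk = threadsPerDisk;
        for (String disk : disks) {
            // Fixed-size pool: extra requests queue up instead of spawning threads.
            poolsByDisk.put(disk, Executors.newFixedThreadPool(threadsPerDisk));
        }
    }

    /** Queue a blocking disk operation on the pool that owns the given disk. */
    <T> CompletableFuture<T> submit(String disk, Supplier<T> diskOp) {
        ExecutorService pool = poolsByDisk.get(disk);
        if (pool == null) {
            throw new IllegalArgumentException("unknown disk: " + disk);
        }
        return CompletableFuture.supplyAsync(diskOp, pool);
    }

    /** Upper bound on I/O threads, independent of the number of clients. */
    int maxIoThreads() {
        return poolsByDisk.size() * threadsPerDisk;
    }

    /** Stop accepting new work; already queued operations still run. */
    void shutdown() {
        for (ExecutorService pool : poolsByDisk.values()) {
            pool.shutdown();
        }
    }
}
```

With two disks and 5 threads per disk, 2000 client connections would still be
served by at most 10 disk I/O threads; the network side would be driven by
non-blocking sockets rather than by a thread per connection.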
> limit concurrent connections(data serving thread) in one datanode
> -----------------------------------------------------------------
>
> Key: HADOOP-3779
> URL: https://issues.apache.org/jira/browse/HADOOP-3779
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs
> Affects Versions: 0.17.1
> Reporter: LN
> Priority: Minor
>
> I'm here after HADOOP-2341 and HADOOP-2346. In my HBase environment, many open
> MapFiles cause a DataNode OOME (stack memory), because there are 2000+ data
> serving threads in the DataNode process.
> Although HADOOP-2346 implemented timeouts, there are situations where many
> connections are created before the read timeout (default 6 min) is reached.
> HBase, for example, opens all files on regionserver startup.
> Limiting concurrent connections (data serving threads) would make the DataNode
> more stable, and I think it could be done in
> SocketIOWithTimeout$SelectorPool#select:
> 1. In SelectorPool#select, record all waiting SelectorInfo instances in a
> List at the beginning, and remove each one after 'Selector#select' is done.
> 2. Before the real 'select', do a limit check; if the limit is reached, close
> the first SelectorInfo.
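
The two steps proposed in the description could be sketched roughly as follows.
This is only an illustration of the bookkeeping, not a patch against
SocketIOWithTimeout: the class WaiterLimiter and its method names are
hypothetical, and java.io.Closeable stands in for whatever a waiting
SelectorInfo would actually have to close.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: track waiters, close the oldest once a limit is hit. */
class WaiterLimiter {
    private final int limit;
    private final Deque<Closeable> waiting = new ArrayDeque<>();

    WaiterLimiter(int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be at least 1");
        }
        this.limit = limit;
    }

    /** Call before the real select(): enforce the limit, then record the waiter. */
    synchronized void beforeSelect(Closeable waiter) {
        // Step 2: if the limit is reached, close the first (oldest) waiter.
        while (waiting.size() >= limit) {
            Closeable oldest = waiting.pollFirst();
            try {
                oldest.close();
            } catch (IOException ignored) {
                // Best effort: the waiter is dropped from the list either way.
            }
        }
        // Step 1: record the new waiter.
        waiting.addLast(waiter);
    }

    /** Call after select() returns: the waiter is no longer waiting. */
    synchronized void afterSelect(Closeable waiter) {
        waiting.remove(waiter);
    }

    synchronized int waitingCount() {
        return waiting.size();
    }
}
```

Note that this closes the oldest waiter regardless of how long it has actually
been idle, which is part of why the comment above suggests handling idle
connections at the DataNode level instead.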