[ 
https://issues.apache.org/jira/browse/HDFS-374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HDFS-374.
-----------------------------------

    Resolution: Fixed

I'm going to resolve this as stale.  The issue may well still exist, but it 
isn't nearly the concern it once was.  If it does, please open a new jira. 

> HDFS needs to support a very large number of open files.
> --------------------------------------------------------
>
>                 Key: HDFS-374
>                 URL: https://issues.apache.org/jira/browse/HDFS-374
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jim Kellerman
>
> Currently, DFSClient maintains one socket per open file. For most map/reduce 
> operations, this is not a problem because there just aren't many open files.
> However, HBase has a very different usage model in which a single region 
> server could have thousands (10**3 but less than 10**4) of open files. 
> This can cause both datanodes and region servers to run out of file handles.
> What I would like to see is one connection for each dfsClient, datanode pair. 
> This would reduce the number of connections to hundreds or tens of sockets.
> The intent is not to process requests totally asynchronously (overlapping 
> block reads and forcing the client to reassemble a whole message out of a 
> bunch of fragments), but rather to queue requests from the client to the 
> datanode and process them serially. This differs from the current 
> implementation in that, rather than using an exclusive socket for each 
> file, only one socket is in use between the client and a particular 
> datanode.
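The scheme described above, one shared connection per (client, datanode) pair with requests queued and handled serially, could be sketched roughly as follows. This is a minimal illustration only; the class names (DatanodeConnection, ConnectionPool) and methods are hypothetical and not part of the actual DFSClient API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: requests to one datanode are serialized over a
// single shared connection instead of one socket per open file.
class DatanodeConnection {
    private final String datanodeAddress; // the one shared socket would live here

    DatanodeConnection(String datanodeAddress) {
        this.datanodeAddress = datanodeAddress;
    }

    // synchronized queues callers so requests are processed serially and
    // responses never interleave on the shared connection.
    synchronized String readBlock(long blockId) {
        return "block-" + blockId + "@" + datanodeAddress;
    }
}

class ConnectionPool {
    // One entry per datanode, regardless of how many files are open.
    private final Map<String, DatanodeConnection> pool = new ConcurrentHashMap<>();

    DatanodeConnection get(String datanodeAddress) {
        return pool.computeIfAbsent(datanodeAddress, DatanodeConnection::new);
    }

    int size() {
        return pool.size();
    }
}

public class Main {
    public static void main(String[] args) {
        ConnectionPool pool = new ConnectionPool();
        // A thousand open files spread over two datanodes would still
        // use only two connections, not a thousand sockets.
        for (long blockId = 0; blockId < 1000; blockId++) {
            String dn = (blockId % 2 == 0) ? "dn1:50010" : "dn2:50010";
            pool.get(dn).readBlock(blockId);
        }
        System.out.println(pool.size());
    }
}
```

With this shape, file-handle usage on both the client and the datanode scales with the number of distinct datanodes (tens to hundreds) rather than with the number of open files.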



--
This message was sent by Atlassian JIRA
(v6.2#6252)
