[ https://issues.apache.org/jira/browse/HADOOP-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558296#action_12558296 ]
stack commented on HADOOP-2577:
-------------------------------

We could do that, or make it optional behavior until we figure out a fix.

FYI, to do a cold-start random read into a MapFile, you need to open two files (the data file and its index), read all of the index into memory, find the closest offset in the index, and then seek around in the data file to find the asked-for key. In HBase currently, only the data file is held open (the index has already been read into memory and the index file has then been let go).

> [hbase] Scaling: Too many open file handles to datanodes
> --------------------------------------------------------
>
>          Key: HADOOP-2577
>          URL: https://issues.apache.org/jira/browse/HADOOP-2577
>      Project: Hadoop
>   Issue Type: Bug
>   Components: contrib/hbase
>     Reporter: stack
>
> We've been here before (HADOOP-2341).
> Today the Rapleaf folks gave me an lsof listing from a regionserver. It had thousands of open sockets to datanodes, all in ESTABLISHED and CLOSE_WAIT state. On average they seem to have about ten file descriptors/sockets open per region (they have 3 column families IIRC; each family can have between 1 and 5 or so mapfiles open -- 3 is the max, but while compacting we open a new one, etc.).
> They have thousands of regions. 400 regions -- ~100G, which is not that much -- takes about 4k open file handles.
> If they want a regionserver to serve a decent disk's worth -- 300-400G -- then that's maybe 1600 regions... 16k file handles. If there are more than just 3 column families, then we are in danger of blowing out limits if they are 32k.
> We've been here before with HADOOP-2341.
> A dfsclient that used non-blocking i/o would help applications like hbase (the datanode doesn't have this problem as badly -- the CLOSE_WAIT sockets on the regionserver side, the bulk of the open fds in the Rapleaf log, don't have a corresponding open resource on the datanode end).
> Could also just open mapfiles as needed, but that'd kill our random read performance and it's bad enough already.
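For reference, a minimal sketch of the cold-start read path described in the comment above, written against the org.apache.hadoop.io.MapFile API. The path and the Text key/value types are assumptions for illustration only; the point is that opening the Reader touches both files under the MapFile directory and pulls the whole index into memory before any seek in the data file.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class ColdStartRead {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Opening the reader opens two files under the MapFile directory:
    // the "data" SequenceFile and the "index" SequenceFile.  The index
    // is read fully into memory so lookups can binary-search it.
    // ("/hbase/some/mapfile" is a hypothetical path.)
    MapFile.Reader reader = new MapFile.Reader(fs, "/hbase/some/mapfile", conf);

    // get() searches the in-memory index for the closest offset, then
    // seeks around in the data file to find the asked-for key.
    Text value = new Text();
    reader.get(new Text("row-key"), value);

    reader.close();
  }
}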
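And the back-of-envelope handle math from the description, just to make the scaling explicit; the per-region figure is the rough average quoted above, not a measurement of any particular cluster.

public class HandleMath {
  public static void main(String[] args) {
    // ~10 fds/sockets per region, per the lsof listing described above
    // (roughly 3 column families x a few open mapfiles each).
    int handlesPerRegion = 10;

    // 400 regions (~100G) comes to about 4k open file handles.
    System.out.println(400 * handlesPerRegion);   // 4000

    // A decent disk's worth (300-400G) is maybe 1600 regions: ~16k handles,
    // uncomfortably close to a 32k fd limit once there are more column families.
    System.out.println(1600 * handlesPerRegion);  // 16000
  }
}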