[
https://issues.apache.org/jira/browse/HBASE-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662736#action_12662736
]
Luo Ning commented on HBASE-24:
-------------------------------
my understanding:
xceiverCount must be set to avoid OOME. Given the -Xss option and the memory
available to the Java process, it can be set much larger than 256; I'm using
2500 on my DNs.
The question is how many MapFiles HBase has open concurrently versus how many
xceiver threads the Hadoop DNs allow.
If write.timeout is set to 0, the xceiverCount limit is reached quickly as
HBase data grows, and then the DN hangs.
Setting write.timeout to 8 minutes only helps in the following situation: HBase
never reads/writes more than xceiverCount MapFiles within those 8 minutes;
otherwise the DN will stop serving data.
The chain is: more data -> more MapFiles -> more open MapFiles -> more xceiver
threads. Comparing all the points in the chain where we could intervene,
controlling the number of open MapFiles is the most effective.
My HBase installation: 2 regionservers/namenodes, each handling about 500
regions and 250G of data. Concurrent open MapFiles is set to 2000, and
xceiverCount should be larger than that, so it is set to 2500.
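For reference, the two DataNode knobs discussed above would be set in
hadoop-site.xml roughly as below. This is a sketch using the values from this
comment, not a recommendation; property names are from the Hadoop of that era
(the xceiver key is historically misspelled "xcievers"):

```xml
<!-- hadoop-site.xml on each DataNode: a sketch, values taken from this comment -->
<property>
  <!-- note the historical misspelling "xcievers" -->
  <name>dfs.datanode.max.xcievers</name>
  <!-- must exceed HBase's concurrent open MapFiles (2000 in this setup) -->
  <value>2500</value>
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <!-- 8 minutes, in milliseconds; 0 disables the timeout entirely -->
  <value>480000</value>
</property>
```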
> Scaling: Too many open file handles to datanodes
> ------------------------------------------------
>
> Key: HBASE-24
> URL: https://issues.apache.org/jira/browse/HBASE-24
> Project: Hadoop HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: stack
> Priority: Blocker
> Fix For: 0.20.0
>
> Attachments: HBASE-823.patch, MonitoredReader.java
>
>
> We've been here before (HADOOP-2341).
> Today the Rapleaf folks gave me an lsof listing from a regionserver. It had
> thousands of open sockets to datanodes, all in ESTABLISHED and CLOSE_WAIT
> state. On average they seem to have about ten file descriptors/sockets open
> per region (they have 3 column families IIRC; each family can have between 1
> and 5 or so mapfiles open -- 3 is the max, but while compacting we open a new
> one, etc.).
> They have thousands of regions. 400 regions -- ~100G, which is not that
> much -- takes about 4k open file handles.
> If they want a regionserver to serve a decent disk's worth -- 300-400G --
> then that's maybe 1600 regions, or 16k file handles. With more than just 3
> column families, we are in danger of blowing out limits if they are 32k.
> We've been here before with HADOOP-2341.
> A dfsclient that used non-blocking i/o would help applications like hbase.
> (The datanode doesn't have this problem as badly -- the CLOSE_WAIT sockets on
> the regionserver side, which make up the bulk of the open fds in the Rapleaf
> log, don't have a corresponding open resource on the datanode end.)
> Could also just open mapfiles as needed, but that would kill our random read
> performance, and it's bad enough already.
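The handle arithmetic in the issue description can be sketched as a rough
estimate, assuming the ~10 descriptors per region observed in the lsof listing
(the constant and function name below are illustrative, not from the issue):

```python
# Rough file-descriptor math from the lsof observation above.
# Assumption: ~10 open sockets/descriptors per region on average.
FDS_PER_REGION = 10

def estimated_fds(regions: int) -> int:
    """Estimate open file handles on a regionserver hosting `regions` regions."""
    return regions * FDS_PER_REGION

print(estimated_fds(400))   # ~4k handles for ~100G of regions
print(estimated_fds(1600))  # ~16k handles for 300-400G -- close to a 32k ulimit
```

With more column families the per-region multiplier grows, which is why the
32k descriptor limit becomes a real concern.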
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.