[
https://issues.apache.org/jira/browse/HBASE-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023302#comment-13023302
]
stack commented on HBASE-3813:
------------------------------
Todd, you can do anything! (J/K). Yes, that sounds good. We already have
'blocking' going on in the app when memstores fill. I'm thinking, though, that
we'd want to just do crass smaller queues for 0.90.3 and then a sizing fix for
0.92.0. (We were going to run some tests here on our frontend to make sure
there are no side effects from taking the queue size down.)
> Change RPC callQueue size from "handlerCount * MAX_QUEUE_SIZE_PER_HANDLER;"
> ---------------------------------------------------------------------------
>
> Key: HBASE-3813
> URL: https://issues.apache.org/jira/browse/HBASE-3813
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.3
> Reporter: stack
> Priority: Critical
>
> Yesterday debugging w/ Jack we noticed that with few handlers on a big box,
> he was seeing stats like this:
> {code}
> 2011-04-21 11:54:49,451 DEBUG org.apache.hadoop.ipc.HBaseServer: Server
> connection from X.X.X.X:60931; # active connections: 11; # queued calls: 2500
> {code}
> We had 2500 items in the rpc queue waiting to be processed.
> Turns out he had too few handlers for the number of clients (but it also
> seems he found hw issues: his RAM bus was running at 1/4 the rate that it
> should have been running at).
> Chatting w/ J-D this morning, he asked if the queues hold 'data'. The queues
> hold 'Calls'. A Call is a client request, and it contains the request data.
> Jack had 2500 items queued. If each item to insert was 1MB, that's 2500 * 1MB
> (roughly 2.5GB) of memory that is outside of our general accounting.
> Currently the queue size is handlers * MAX_QUEUE_SIZE_PER_HANDLER where
> MAX_QUEUE_SIZE_PER_HANDLER is hardcoded to be 100.
> If the queue is full we block (LinkedBlockingQueue).
> Going to change the queue size from 100 to 10 by default -- but also will
> make it configurable and will doc. this as possible cause of OOME. Will try
> it on production here before committing patch.
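
The sizing rule and the memory concern described above can be sketched in Java roughly as follows. This is a minimal illustration, not the HBaseServer code: the class `CallQueueSizing`, its `Call` stand-in, and all method names are hypothetical, and the per-handler limit is passed in as a plain parameter rather than read from any real HBase configuration key.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from the HBase code base.
final class CallQueueSizing {

    // Stand-in for an RPC Call: it carries the client's request data.
    static final class Call {
        final byte[] payload;

        Call(byte[] payload) {
            this.payload = payload;
        }
    }

    // Queue capacity as described in the issue: handlers * per-handler limit.
    static int queueCapacity(int handlerCount, int maxQueueSizePerHandler) {
        return handlerCount * maxQueueSizePerHandler;
    }

    // Worst-case bytes held by queued calls if each one carries bytesPerCall.
    static long worstCaseQueuedBytes(int queuedCalls, long bytesPerCall) {
        return (long) queuedCalls * bytesPerCall;
    }

    // Bounded queue: put() blocks and offer() returns false once it is full.
    static BlockingQueue<Call> newCallQueue(int handlerCount, int maxQueueSizePerHandler) {
        return new LinkedBlockingQueue<>(queueCapacity(handlerCount, maxQueueSizePerHandler));
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 2 handlers, 10 queued calls per handler.
        BlockingQueue<Call> queue = newCallQueue(2, 10);
        while (queue.offer(new Call(new byte[16]))) {
            // Keep adding until the queue reports that it is at capacity.
        }
        System.out.println("queued calls at capacity: " + queue.size());
        System.out.println("worst-case bytes for 2500 calls of 1MB each: "
                + worstCaseQueuedBytes(2500, 1024L * 1024L));
    }
}
```

With the per-handler limit as a parameter, dropping it from 100 to 10 cuts the worst-case memory held in the queue by the same factor of ten for a given handler count and call size.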