[jira] [Commented] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

Yu Li (JIRA) Fri, 18 Nov 2016 10:30:25 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677377#comment-15677377
 ]


Yu Li commented on HBASE-17114:
-------------------------------

bq. For advanced users who really need to treat CQTBE differently, that should 
be possible by means of an override, but should not be forced on everyone.
Agreed.

bq. I'm all for improving the client/server interactions in these scenarios, 
and what I first outlined in this issue was one idea for how to do that more 
effectively. However, I would also like us to avoid unexpected surprises for 
our users, and regressions in server behavior.
Yep, this was indeed a surprise for us: CQTBE did not exist before, so there 
was no special handling for it in client-side code, and users kept asking 
"what's CQTBE, and why is it happening now when it never did before"...

One thing to clarify: I don't mean to deny the advantage of introducing CQTBE; 
the fact that we backported it and use it in our 1.1.2 is proof (smile). My 
concern lies in the server-side behavior change, just as you mentioned. I think 
more documentation in the ref guide would help users upgrading from an older 
version without CQTBE.

bq. I'm not sure of the exact symptoms you're trying to solve, but if you're 
seeing issues with meta being overloaded, then I'd suggest tuning the 
configuration for the number of priority handlers and size of the priority 
queues.
Actually we did: we moved meta to a dedicated machine (no other regions on it) 
and increased the priority handlers to 128 (I'm afraid HBASE-15470 only went 
into branch-1.3, and the priority queue was not controllable before that), but 
we still observed high load, which is why we went on to introduce the patch 
here.

bq. You could also evaluate running with meta hosted on master, which together 
with zk-less assignment can make region assignment much more stable.
I'm afraid this feature is also not available before branch-1.3. And because 
the master is currently lightweight and we can hot-switch it to apply a 
hot-fix, we may also not want the master to carry meta in the future.



> Add an option to set special retry pause when encountering 
> CallQueueTooBigException
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-17114
>                 URL: https://issues.apache.org/jira/browse/HBASE-17114
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-17114.patch
>
>
> As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} instead 
> of dead-waiting. This is good for performance in most cases, but it may have 
> a side effect: if too many clients connect to a busy RS, retry requests may 
> arrive over and over again, so the RS never gets a chance to recover. The 
> issue becomes especially critical when the target region is META.
> So in this JIRA we propose a special retry pause for CQTBE, named 
> {{hbase.client.pause.special}}, which defaults to 500ms (5 times the default 
> value of {{hbase.client.pause}}).
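
To make the proposal above concrete, here is a minimal sketch of how a client 
might pick its base retry pause depending on the error type. It is not taken 
from the attached patch: the class RetryPauseSketch, the method basePauseFor, 
and the local CallQueueTooBigException stand-in are all hypothetical names 
used for illustration only. The 100ms and 500ms values come from the 
description (500ms being 5 times the {{hbase.client.pause}} default).

```java
import java.io.IOException;

public class RetryPauseSketch {

    // Hypothetical stand-in for HBase's CallQueueTooBigException, declared
    // locally so that this sketch compiles without HBase on the classpath.
    public static class CallQueueTooBigException extends IOException {
        public CallQueueTooBigException() {
            super("call queue is full");
        }
    }

    // Default of hbase.client.pause, as implied by the description (500 / 5).
    public static final long DEFAULT_PAUSE_MS = 100L;

    // Proposed default of hbase.client.pause.special.
    public static final long DEFAULT_SPECIAL_PAUSE_MS = 500L;

    // Returns the base pause to apply before the next retry: the special
    // pause when the server rejected the call because its queue was full,
    // and the normal pause for any other error.
    public static long basePauseFor(Throwable error, long pauseMs, long specialPauseMs) {
        return (error instanceof CallQueueTooBigException) ? specialPauseMs : pauseMs;
    }

    public static void main(String[] args) {
        long onCqtbe = basePauseFor(new CallQueueTooBigException(),
                DEFAULT_PAUSE_MS, DEFAULT_SPECIAL_PAUSE_MS);
        long onOther = basePauseFor(new IOException("some other failure"),
                DEFAULT_PAUSE_MS, DEFAULT_SPECIAL_PAUSE_MS);
        System.out.println("pause after CQTBE (ms): " + onCqtbe);
        System.out.println("pause after other error (ms): " + onOther);
    }
}
```

Passing both pauses in as parameters keeps the choice an override rather than 
something forced on every client: a user who sets the special pause equal to 
the normal pause gets the old behavior back.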



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
