[ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677228#comment-15677228
 ] 

Gary Helmling commented on HBASE-17114:
---------------------------------------

-1 to the current patch:

* by default, retries after a CallQueueTooBigException (CQTBE) should use the 
value from hbase.client.pause.  Switching to a different config value by 
default changes behavior unexpectedly for _all_ users.  If you've already 
tuned hbase.client.pause and suddenly find some requests pausing longer than 
others because of this change, that is a poor experience.
* hbase.client.pause.special does not describe what this actually configures.  
Rename it to hbase.client.pause.callqueuetoobigexception and add it, with no 
default value, but with a description, to hbase-default.xml.  This needs to be 
clearly documented.
* only if hbase.client.pause.callqueuetoobigexception is set should you use 
this as a "special" pause for CQTBE, otherwise use hbase.client.pause.  This 
allows you to configure what you need in your environment without impacting all 
other HBase users.
* the added test case looks like it will be extremely sensitive to timing in 
the test environment and will likely be flaky on slow or overloaded machines.  
I think it would be better to simply test the calculated pause time for various 
configs + exceptions instead of trying to do an end to end test of the actual 
sleep time.
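
A minimal sketch of the fallback and the kind of test suggested above, not 
the actual client code.  Assumptions: the class name, the helper names, the 
Map standing in for the HBase Configuration, and the stand-in exception class 
are all hypothetical; the 100ms default is inferred from the issue 
description (500ms being 5 times the hbase.client.pause default).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the patch.
final class PauseSelection {
    static final String PAUSE_KEY = "hbase.client.pause";
    static final String CQTBE_PAUSE_KEY =
        "hbase.client.pause.callqueuetoobigexception";
    // Assumed default, inferred from the issue description.
    static final long DEFAULT_PAUSE_MS = 100L;

    // Stand-in for the real CallQueueTooBigException.
    static class CallQueueTooBigException extends Exception {}

    // Returns the base pause for a retry. The CQTBE-specific key is used
    // only when it is explicitly set; otherwise hbase.client.pause applies.
    static long basePauseMs(Map<String, String> conf, Throwable error) {
        long pause = parseOrDefault(conf.get(PAUSE_KEY), DEFAULT_PAUSE_MS);
        if (error instanceof CallQueueTooBigException) {
            return parseOrDefault(conf.get(CQTBE_PAUSE_KEY), pause);
        }
        return pause;
    }

    private static long parseOrDefault(String value, long fallback) {
        return value == null ? fallback : Long.parseLong(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PAUSE_KEY, "200");
        // No CQTBE-specific key set: both cases use hbase.client.pause.
        System.out.println(basePauseMs(conf, new CallQueueTooBigException()));
        System.out.println(basePauseMs(conf, new RuntimeException()));
        // With the CQTBE-specific key set, only CQTBE retries change.
        conf.put(CQTBE_PAUSE_KEY, "500");
        System.out.println(basePauseMs(conf, new CallQueueTooBigException()));
        System.out.println(basePauseMs(conf, new RuntimeException()));
    }
}
```

A test written against a function like basePauseMs checks the computed 
value directly for each config + exception combination, so it never sleeps 
and does not depend on machine load.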

> Add an option to set special retry pause when encountering 
> CallQueueTooBigException
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-17114
>                 URL: https://issues.apache.org/jira/browse/HBASE-17114
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-17114.patch
>
>
> As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} 
> instead of dead-waiting. This is good for performance in most cases, but it 
> has a side effect: if too many clients connect to a busy RS, retry requests 
> may arrive over and over and the RS never gets a chance to recover. The 
> issue becomes especially critical when the target region is META.
> So in this JIRA we propose a special retry pause for CQTBE, configured via 
> {{hbase.client.pause.special}}, defaulting to 500ms (5 times the default 
> value of {{hbase.client.pause}}).



