[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677228#comment-15677228 ]
Gary Helmling commented on HBASE-17114:
---------------------------------------
-1 to the current patch:
* by default, retries of CallQueueTooBigException (CQTBE) should use the value
from hbase.client.pause. Changing this to use a different config value by
default changes behavior unexpectedly for _all_ users. If you've already tuned
hbase.client.pause and suddenly find some requests pausing longer than others
because of this change, that is a poor experience.
* hbase.client.pause.special does not describe what this actually configures.
Rename it to hbase.client.pause.callqueuetoobigexception and add it, with no
default value, but with a description, to hbase-default.xml. This needs to be
clearly documented.
* only if hbase.client.pause.callqueuetoobigexception is set should you use
this as a "special" pause for CQTBE, otherwise use hbase.client.pause. This
allows you to configure what you need in your environment without impacting all
other HBase users.
* the added test case looks like it will be extremely sensitive to timing in
the test environment and will likely be flaky on slow or overloaded machines.
I think it would be better to simply test the calculated pause time for various
configs + exceptions instead of trying to do an end to end test of the actual
sleep time.
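
To make the last two points concrete, here is a rough sketch of the fallback I
have in mind, written so the calculated pause can be checked directly without
any sleeping. Everything in it is illustrative: the class and method names are
hypothetical, a plain Map stands in for the HBase Configuration, the nested
exception stands in for the real CallQueueTooBigException, and the 100ms
default is inferred from the issue description (500ms being 5 times the
default). It is not the code from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the pause selection; not the actual HBase client code.
final class RetryPauseSketch {
    static final String PAUSE_KEY = "hbase.client.pause";
    // Key name proposed in this review; deliberately has no default value.
    static final String CQTBE_PAUSE_KEY = "hbase.client.pause.callqueuetoobigexception";
    // Assumed default for hbase.client.pause, in milliseconds.
    static final long DEFAULT_PAUSE_MS = 100L;

    /** Stand-in for the real CallQueueTooBigException. */
    static class CallQueueTooBigException extends java.io.IOException {
        private static final long serialVersionUID = 1L;
    }

    /** Returns the base retry pause in milliseconds for the given error. */
    static long basePauseMs(Map<String, String> conf, Throwable error) {
        long pause = parseOrDefault(conf.get(PAUSE_KEY), DEFAULT_PAUSE_MS);
        if (error instanceof CallQueueTooBigException) {
            // Use the special pause only when it is explicitly configured;
            // otherwise fall back to the normal pause.
            return parseOrDefault(conf.get(CQTBE_PAUSE_KEY), pause);
        }
        return pause;
    }

    private static long parseOrDefault(String value, long fallback) {
        return value == null ? fallback : Long.parseLong(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PAUSE_KEY, "200");
        // Special key unset: CQTBE retries use the normal pause.
        System.out.println(basePauseMs(conf, new CallQueueTooBigException()));
        conf.put(CQTBE_PAUSE_KEY, "500");
        // Special key set: only CQTBE retries change.
        System.out.println(basePauseMs(conf, new CallQueueTooBigException()));
        System.out.println(basePauseMs(conf, new java.io.IOException("other")));
    }
}
```

A test along these lines only asserts on the returned value for each
config + exception combination, so it does not depend on how fast the test
machine is.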
> Add an option to set special retry pause when encountering
> CallQueueTooBigException
> -----------------------------------------------------------------------------------
>
> Key: HBASE-17114
> URL: https://issues.apache.org/jira/browse/HBASE-17114
> Project: HBase
> Issue Type: Bug
> Reporter: Yu Li
> Assignee: Yu Li
> Attachments: HBASE-17114.patch
>
>
> As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} instead of
> dead-waiting. This is good for performance in most cases, but it has a side
> effect: if too many clients connect to a busy RS, retry requests may arrive
> over and over and the RS never gets a chance to recover. The issue becomes
> especially critical when the target region is META.
> So in this JIRA we propose a special retry pause for CQTBE, configured via
> {{hbase.client.pause.special}}, which defaults to 500ms (5 times the default
> value of {{hbase.client.pause}}).