[
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15675588#comment-15675588
]
Yu Li commented on HBASE-17114:
-------------------------------
bq. Another approach to this would be to allow the server to hint back to the
client how long it should back off
I assume "back off" in the above statement refers to the backoff policy rather
than the exponential backoff array? I checked the default
{{ClientBackoffPolicy}} on that assumption; otherwise, could you please explain
how to make the server hint back? [~ghelmling]
bq. If you want to make this overridable for some exception types, that seems
ok, but in that case the config property for overriding the value should be
more closely tied to the exception.
Well, if you check the uploaded patch, it is indeed tied to CQTBE only.
Introducing a new property is only meant to make things more flexible; of
course we could use a hard-coded value for CQTBE instead, such as 5 times the
existing pause. But I'd say this is a trade-off: waiting longer on CQTBE could
prevent the vicious circle but also causes higher latency, and IMHO users
should be able to control that trade-off. If they don't want CQTBE to be
special, they can set {{hbase.client.pause.special}} to the same value as
{{hbase.client.pause}}, so the property gives them more options.
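To make the trade-off concrete, here is a minimal sketch of how the two properties could feed the retry pause. It is not the uploaded patch: the class {{RetryPauseSketch}}, its methods, and the plain {{Map}} standing in for the client configuration are all hypothetical, the multiplier table is assumed to match the client's exponential backoff array, and the jitter the real client adds is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: neither the HBASE-17114 patch nor real HBase client code. */
final class RetryPauseSketch {
    // Assumed multipliers, modeled on the client's exponential backoff array.
    static final int[] BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

    // Defaults taken from the issue description: 100ms normal pause, 500ms for CQTBE.
    static final long DEFAULT_PAUSE_MS = 100L;
    static final long DEFAULT_SPECIAL_PAUSE_MS = 500L;

    /** Picks the base pause: the special one for CQTBE, the normal one otherwise. */
    static long basePauseMs(Map<String, Long> conf, boolean isCqtbe) {
        if (isCqtbe) {
            return conf.getOrDefault("hbase.client.pause.special", DEFAULT_SPECIAL_PAUSE_MS);
        }
        return conf.getOrDefault("hbase.client.pause", DEFAULT_PAUSE_MS);
    }

    /** Base pause times the backoff multiplier for this attempt (jitter omitted). */
    static long pauseMs(Map<String, Long> conf, boolean isCqtbe, int tries) {
        // Clamp the attempt number so late retries reuse the last multiplier.
        int idx = Math.max(0, Math.min(tries, BACKOFF.length - 1));
        return basePauseMs(conf, isCqtbe) * BACKOFF[idx];
    }

    public static void main(String[] args) {
        Map<String, Long> conf = new HashMap<>();
        for (int tries = 0; tries < 4; tries++) {
            System.out.println("try " + tries
                + ": normal=" + pauseMs(conf, false, tries) + "ms"
                + ", CQTBE=" + pauseMs(conf, true, tries) + "ms");
        }
        // Opting out: make CQTBE behave like any other retriable exception.
        conf.put("hbase.client.pause.special", 100L);
        System.out.println("after opt-out, CQTBE=" + pauseMs(conf, true, 0) + "ms");
    }
}
```

With the defaults, every CQTBE retry waits 5 times longer than a normal retry at the same attempt number, and setting both properties to the same value removes the difference entirely.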
No offense, but I'm even thinking of making the throwing of CQTBE optional,
because in some cases users would prefer to dead-wait for the request to be
executed in RpcServer until it times out rather than receive an exception,
retry, and fail again. But obviously this is another topic (Smile).
bq. It's only special in the sense that it should not clear the client meta
cache
Sorry, but I don't see any difference between "should not clear the client
meta cache" and "should not retry so frequently": both try to resolve some
problem and make things better.
OTOH, we already have {{RetryImmediatelyException}} precisely because in some
cases retrying w/o waiting is good, so why is retrying more slowly not
acceptable? Now that the retry pause is already split into immediate and wait,
I think it's ok to further split the wait case into quick and slow, wdyt?
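The proposed three-way split could look roughly like the sketch below. This is only an illustration of the idea, not how the client is actually structured: {{PauseSplitSketch}}, {{PauseKind}}, and the two {{Fake...}} exception classes are hypothetical local stand-ins, not the real HBase classes.

```java
/** Illustration only; the exception classes below are local stand-ins, not HBase's. */
final class PauseSplitSketch {
    static class FakeRetryImmediatelyException extends Exception {}
    static class FakeCallQueueTooBigException extends Exception {}

    /** The existing immediate/wait split, with "wait" further split into quick and slow. */
    enum PauseKind { IMMEDIATE, QUICK, SLOW }

    static PauseKind classify(Throwable t) {
        if (t instanceof FakeRetryImmediatelyException) {
            return PauseKind.IMMEDIATE;   // retry w/o waiting
        }
        if (t instanceof FakeCallQueueTooBigException) {
            return PauseKind.SLOW;        // server is overloaded, back off harder
        }
        return PauseKind.QUICK;           // every other retriable failure
    }

    /** Base pause before the backoff multiplier is applied. */
    static long basePauseMs(Throwable t, long quickMs, long slowMs) {
        switch (classify(t)) {
            case IMMEDIATE: return 0L;
            case SLOW:      return slowMs;
            default:        return quickMs;
        }
    }

    public static void main(String[] args) {
        long quick = 100L; // plays the role of hbase.client.pause
        long slow = 500L;  // plays the role of hbase.client.pause.special
        System.out.println(basePauseMs(new FakeRetryImmediatelyException(), quick, slow));
        System.out.println(basePauseMs(new java.io.IOException("other"), quick, slow));
        System.out.println(basePauseMs(new FakeCallQueueTooBigException(), quick, slow));
    }
}
```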
Thanks.
> Add an option to set special retry pause when encountering
> CallQueueTooBigException
> -----------------------------------------------------------------------------------
>
> Key: HBASE-17114
> URL: https://issues.apache.org/jira/browse/HBASE-17114
> Project: HBase
> Issue Type: Bug
> Reporter: Yu Li
> Assignee: Yu Li
> Attachments: HBASE-17114.patch
>
>
> As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} instead
> of dead-waiting. This is good for performance in most cases, but it might
> have a side effect: if too many clients connect to a busy RS, the retry
> requests may come in over and over again so that the RS never gets a chance
> to recover, and the issue becomes especially critical when the target region
> is META.
> So in this JIRA we propose a special retry pause for CQTBE, named
> {{hbase.client.pause.special}}, which by default will be 500ms (5 times the
> default value of {{hbase.client.pause}}).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)