[
https://issues.apache.org/jira/browse/HBASE-16388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15490850#comment-15490850
]
stack commented on HBASE-16388:
-------------------------------
The Findbugs issue is unrelated:
Code	Warning
RV	Return value of java.util.concurrent.CountDownLatch.await(long, TimeUnit) ignored in org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper()
Looks like the attempt at fixing this elsewhere did not work.
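For context, the warning is about dropping the boolean that CountDownLatch.await(long, TimeUnit) returns. A minimal sketch of the sort of check Findbugs is asking for (the latch, timeout and message below are placeholders, not the actual initializeZooKeeper() code):
{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LatchWaitSketch {
  // await(long, TimeUnit) returns false when the timeout elapses before the
  // count reaches zero; acting on that value is what clears the RV warning.
  static void waitForTracker(CountDownLatch latch) throws InterruptedException {
    if (!latch.await(30, TimeUnit.SECONDS)) {
      System.err.println("Timed out waiting for tracker to come up");
    }
  }
}
{code}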
I ran the tests locally and they passed. I still need to dig into why they are
failing here, but they seem unrelated to this patch. Going to commit.
> Prevent client threads being blocked by only one slow region server
> -------------------------------------------------------------------
>
> Key: HBASE-16388
> URL: https://issues.apache.org/jira/browse/HBASE-16388
> Project: HBase
> Issue Type: New Feature
> Reporter: Phil Yang
> Assignee: Phil Yang
> Attachments: HBASE-16388-branch-1-v1.patch,
> HBASE-16388-branch-1-v2.patch, HBASE-16388-v1.patch, HBASE-16388-v2.patch,
> HBASE-16388-v2.patch, HBASE-16388-v2.patch, HBASE-16388-v2.patch,
> HBASE-16388-v3.patch
>
>
> It is a common pattern for HBase users to have several threads/handlers in
> their service, each with its own Table/HTable instance. Users generally
> assume the handlers are independent and won't interact with each other.
> However, in an extreme case, if one region server is very slow, every
> request to that RS will time out. The long-waiting requests can occupy the
> service's handlers, so even requests that belong to other RSs end up timing
> out as well.
> For example:
> Suppose we have 100 handlers in a client service (timeout 1000 ms) and
> HBase has 10 region servers whose average response time is 50 ms. If no
> region server is slow, each handler can finish 1000/50 = 20 requests per
> second, so the service can handle 2000 requests per second.
> Now suppose the service's QPS is 1000 and one region server is so slow that
> every request to it times out. Users would hope that only 10% of requests
> fail and the other 90% still respond in 50 ms, because only 10% of requests
> are routed to the slow RS. However, every second 100 requests wait out the
> full 1000 ms timeout on the slow RS, which is exactly enough to occupy all
> 100 handlers. So all handlers are blocked and the availability of the
> service drops to almost zero; the sketch below works through the arithmetic.
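> As a back-of-the-envelope check of the numbers above (a standalone sketch,
> not part of the patch; the class and variable names are illustrative):
> {code:java}
> // Rough arithmetic behind the example: 100 handlers, 1000 ms timeout,
> // 50 ms average latency, 1000 QPS, 1 slow RS out of 10.
> public class HandlerSaturationMath {
>   public static void main(String[] args) {
>     int handlers = 100;
>     double avgLatencyMs = 50;
>     double timeoutMs = 1000;
>     double qps = 1000;
>     double slowRsShare = 0.1; // 1 of 10 region servers is slow
>
>     // Healthy capacity: each handler completes 1000/50 = 20 requests/second.
>     double healthyCapacity = handlers * (1000.0 / avgLatencyMs); // 2000 req/s
>
>     // Requests per second hitting the slow RS, each holding a handler for
>     // the full timeout: 100 req/s * 1 s = 100 handlers tied up at any moment.
>     double stuckHandlers = qps * slowRsShare * (timeoutMs / 1000.0);
>
>     System.out.printf("healthy capacity: %.0f req/s%n", healthyCapacity);
>     System.out.printf("handlers stuck on slow RS: %.0f of %d%n",
>         stuckHandlers, handlers);
>   }
> }
> {code}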
> To prevent this, we can limit the maximum number of concurrent requests to
> a single RS at the process level. Requests exceeding the limit throw a
> ServerBusyException (extends DoNotRetryIOE) to users immediately, as
> sketched below. In the above case, if we set the limit to 20, only 20
> handlers will be occupied and the other 80 handlers can still serve
> requests to the other RSs. The availability of the service stays at 90%,
> as expected.
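> A minimal client-side sketch of how such a limit could be configured and
> handled; the config key name below is an assumption (it may not match what
> the patch finally uses), and the catch is on DoNotRetryIOException since
> ServerBusyException extends it:
> {code:java}
> // Sketch only: enable an assumed per-server request limit and fail fast
> // instead of holding a handler for the full client timeout.
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.DoNotRetryIOException;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
> import org.apache.hadoop.hbase.client.Get;
> import org.apache.hadoop.hbase.client.Table;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class PerServerLimitExample {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = HBaseConfiguration.create();
>     // Assumed config key: allow at most 20 in-flight requests per region server.
>     conf.setInt("hbase.client.perserver.requests.threshold", 20);
>
>     try (Connection conn = ConnectionFactory.createConnection(conf);
>          Table table = conn.getTable(TableName.valueOf("t1"))) {
>       try {
>         table.get(new Get(Bytes.toBytes("row1")));
>       } catch (DoNotRetryIOException e) {
>         // With the limit exceeded the request fails immediately here, so the
>         // handler is free to serve requests to the other region servers.
>         System.err.println("Server busy, failing fast: " + e.getMessage());
>       }
>     }
>   }
> }
> {code}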
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)