[ 
https://issues.apache.org/jira/browse/HBASE-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692696#comment-13692696
 ] 

Chao Shi commented on HBASE-8801:
---------------------------------

Here are our proposed solutions (worked out by [~xingren23], [~boneylw] and 
me). We look forward to advice from the experts here.

h4. a) client-side blacklist
A very naive solution is to introduce znodes: /hbase/unhealth/<region>. Once a 
region is set to read-only or write-only, we record that in its znode. 
Clients watch /hbase/unhealth. Whenever a client is notified, it starts 
rejecting read or write requests accordingly.

This change should be simple as it does not touch anything on the RS side. One 
shortcoming is that it does not handle splits easily: when a region splits, the 
child regions lose the blacklist settings.
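
A minimal sketch of the client-side bookkeeping for approach a), in plain Java 
so that it runs without ZooKeeper on the classpath. The class and method names 
(RegionMode, ClientBlacklist, onZnodeSet, onZnodeDeleted) are hypothetical and 
are not taken from the prototype; the two callbacks stand in for whatever the 
real ZK watch on /hbase/unhealth would invoke.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** The two degraded states named in the proposal. */
enum RegionMode { READ_ONLY, WRITE_ONLY }

/** Hypothetical client-side mirror of the children of /hbase/unhealth. */
class ClientBlacklist {
    // Region name -> degraded state; an absent entry means no restriction.
    private final Map<String, RegionMode> degraded = new ConcurrentHashMap<>();

    /** Assumed to be called by the watch callback when a znode is created or changed. */
    void onZnodeSet(String region, RegionMode mode) {
        degraded.put(region, mode);
    }

    /** Assumed to be called by the watch callback when a znode is deleted. */
    void onZnodeDeleted(String region) {
        degraded.remove(region);
    }

    /** Reads are rejected only while the region is write-only. */
    boolean allowsRead(String region) {
        return degraded.get(region) != RegionMode.WRITE_ONLY;
    }

    /** Writes are rejected only while the region is read-only. */
    boolean allowsWrite(String region) {
        return degraded.get(region) != RegionMode.READ_ONLY;
    }
}
```

Because the map is keyed by region name, a lookup for a daughter region misses 
the parent's entry, which is the split problem described above.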

[~boneylw] has made a prototype for approach a).

h4. b) server-side blacklist
Doing this on the server side is slightly more complicated. We need to 
introduce a field in HRegionInfo in the META table, indicating whether a 
region allows reads and/or writes.

1. A user calls the RS to set a region’s read-only or write-only status.
2. The RS then updates the META table as well as its in-memory data structure. 
3. From now on, new read or write requests to that region are rejected with 
an exception (a subclass of NotServingRegionException). This could be done in 
HRegion#startRegionOperation, similar to what HBASE-7006 does.
4. When the client receives NotServingRegionException, it rejects requests on 
the client side. The current implementation talks to the META table on every 
retry. We need to cache the location for a short time (e.g. ~10s to 1min) to 
avoid touching the META table too frequently.
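
Steps 3 and 4 could look roughly like the sketch below. It is plain Java with 
no HBase dependency: NotServingRegionException here is a local stand-in for the 
HBase class, and RegionDegradedException, Op, DegradableRegion and 
RejectionCache are hypothetical names. The cache is one possible reading of 
step 4 (remember a rejection for a short TTL and fail fast until it expires); 
the clock is injected only so that expiry can be exercised without sleeping.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Local stand-in for HBase's NotServingRegionException. */
class NotServingRegionException extends IOException {
    NotServingRegionException(String msg) { super(msg); }
}

/** Hypothetical subclass for an administratively disabled operation type. */
class RegionDegradedException extends NotServingRegionException {
    RegionDegradedException(String msg) { super(msg); }
}

enum Op { READ, WRITE }

/** Server side: the in-memory flags that step 2 updates and step 3 checks. */
class DegradableRegion {
    private final String name;
    private volatile boolean readsAllowed = true;
    private volatile boolean writesAllowed = true;

    DegradableRegion(String name) { this.name = name; }

    /** A real RS would persist the change to META before flipping these flags. */
    void setAllowed(boolean reads, boolean writes) {
        readsAllowed = reads;
        writesAllowed = writes;
    }

    /** Models the check proposed for HRegion#startRegionOperation. */
    void startRegionOperation(Op op) throws NotServingRegionException {
        boolean allowed = (op == Op.READ) ? readsAllowed : writesAllowed;
        if (!allowed) {
            throw new RegionDegradedException(name + " rejects " + op);
        }
    }
}

/** Client side: remembers rejections briefly so retries do not hit META. */
class RejectionCache {
    private final long ttlMillis;
    private final LongSupplier clock;
    private final Map<String, Long> expiry = new ConcurrentHashMap<>();

    RejectionCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void recordRejection(String region, Op op) {
        expiry.put(region + "/" + op, clock.getAsLong() + ttlMillis);
    }

    /** True while a recorded rejection for this region and operation is fresh. */
    boolean shouldFailFast(String region, Op op) {
        String key = region + "/" + op;
        Long until = expiry.get(key);
        if (until == null) {
            return false;
        }
        if (clock.getAsLong() >= until) {
            expiry.remove(key); // entry expired; the next request goes to the RS again
            return false;
        }
        return true;
    }
}
```

A client would call recordRejection when it catches the exception and consult 
shouldFailFast before each retry; in production the clock would simply be 
System::currentTimeMillis.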

On split, child regions inherit the blacklist settings.
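
As an illustration of the inheritance rule, assuming the new HRegionInfo field 
is modelled as two booleans (RegionInfoSketch and its members are hypothetical 
and are not the real HRegionInfo API):

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for HRegionInfo carrying the proposed read/write field. */
final class RegionInfoSketch {
    final String name;
    final boolean readsAllowed;
    final boolean writesAllowed;

    RegionInfoSketch(String name, boolean readsAllowed, boolean writesAllowed) {
        this.name = name;
        this.readsAllowed = readsAllowed;
        this.writesAllowed = writesAllowed;
    }

    /** Both daughters copy the parent's flags, so a split does not lift the restriction. */
    List<RegionInfoSketch> split() {
        return Arrays.asList(
            new RegionInfoSketch(name + "-a", readsAllowed, writesAllowed),
            new RegionInfoSketch(name + "-b", readsAllowed, writesAllowed));
    }
}
```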

If we get a timeout exception between steps 2 and 3, we don’t know what the 
HRegionInfo in META will be. It is possible to get an exception here even 
though the mutation eventually goes through. The risk is that the RS’s 
in-memory view of the region becomes inconsistent with the META table. We 
think this is acceptable because it is a tool and is supposed to be executed 
manually. (Alternatively, we could abort the RS just as a split transaction 
does, but that would be overkill.)

We considered using ZK rather than the META table. We prefer the META table as 
it is the central place for storing region state.
                
> Region level read/write degradation
> -----------------------------------
>
>                 Key: HBASE-8801
>                 URL: https://issues.apache.org/jira/browse/HBASE-8801
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Chao Shi
>
> We would like to propose a tool for HBase administrators to disable reads 
> and/or writes for a single region temporarily.
> Our HBase table at weibo.com is accessed by many clients. We have experienced 
> the following outage: due to bugs in user code, a small number of regions were 
> read at a very high rate. Requests to other regions on the same RSs were 
> affected. In such a scenario, the admin would like to temporarily set the 
> region to write-only (forbidding only reads, so writes can still be served).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
