[
https://issues.apache.org/jira/browse/HBASE-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692696#comment-13692696
]
Chao Shi commented on HBASE-8801:
---------------------------------
Here are our proposed solutions (worked out by [~xingren23], [~boneylw] and
me). We look forward to advice from the experts.
h4. a) client-side blacklist
A very naive solution is to introduce znodes: /hbase/unhealth/<region>. Once a
region is set to read-only or write-only, we write something to the znode.
Clients watch /hbase/unhealth. Whenever a client is notified, it starts to
reject read or write requests accordingly.
This change should be simple as it does not touch anything on the RS side. One
shortcoming is that it does not handle splits easily: when a region splits, the
child regions lose the blacklist settings.
[~boneylw] has made a prototype for approach a).
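For illustration only (this is not the prototype mentioned above), the client-side state for approach a) could look roughly like the sketch below. All names are hypothetical, and the znode payload is assumed to be the mode name as a string; the ZooKeeper watcher that would invoke the two callbacks is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout; not HBase or ZooKeeper API.
enum RegionAccess { READ_WRITE, READ_ONLY, WRITE_ONLY }

class ClientBlacklist {
    // Region name -> mode published under /hbase/unhealth/<region>.
    private final Map<String, RegionAccess> modes = new ConcurrentHashMap<>();

    // To be called by the watcher when a child znode is created or changed.
    // Assumption: the znode data is the enum constant name, e.g. "WRITE_ONLY".
    void onZnodeChanged(String region, String data) {
        modes.put(region, RegionAccess.valueOf(data));
    }

    // To be called by the watcher when the child znode is deleted.
    void onZnodeDeleted(String region) {
        modes.remove(region);
    }

    // Regions without a znode are treated as fully available.
    boolean allowsRead(String region) {
        return modes.getOrDefault(region, RegionAccess.READ_WRITE)
                != RegionAccess.WRITE_ONLY;
    }

    boolean allowsWrite(String region) {
        return modes.getOrDefault(region, RegionAccess.READ_WRITE)
                != RegionAccess.READ_ONLY;
    }
}
```

Because the map is keyed by the parent region's name, daughter regions created by a split have no entry, which is the split problem described above.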
h4. b) server-side blacklist
Doing that on the server side is slightly more complicated. We need to
introduce a field in HRegionInfo in the META table, indicating whether reads
and/or writes are allowed for a region.
1. A user calls the RS to set a region’s read-only or write-only status.
2. The RS then updates the META table as well as its in-memory data structure.
3. From now on, new read or write requests to that region will be rejected with
an exception (a subclass of NotServingRegionException). This could be done in
HRegion#startRegionOperation, similar to what HBASE-7006 does.
4. When the client receives the NotServingRegionException, it rejects requests
on the client side. The current implementation talks to the META table on every
retry. We need to cache the location for a short time (e.g. ~10s to 1min) to
avoid touching the META table too frequently.
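As a rough sketch of the short-lived caching in step 4: the class below remembers a rejection for a fixed time so that retries within that window can fail fast without going back to META. The class and method names are hypothetical, and it simplifies the idea by caching the rejection rather than the region location itself and by not distinguishing reads from writes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper; not an HBase class.
class RejectionCache {
    private final long ttlMillis;
    // Region name -> time (ms) until which the rejection is remembered.
    private final Map<String, Long> rejectedUntil = new ConcurrentHashMap<>();

    RejectionCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Record that the RS rejected a request to this region at nowMillis.
    void recordRejection(String region, long nowMillis) {
        rejectedUntil.put(region, nowMillis + ttlMillis);
    }

    // True while the cached rejection is still fresh, so the caller can
    // reject locally instead of looking up META and retrying.
    boolean shouldFailFast(String region, long nowMillis) {
        Long until = rejectedUntil.get(region);
        if (until == null) {
            return false;
        }
        if (nowMillis >= until) {
            // Expired: drop the entry so the next request goes to the RS again.
            rejectedUntil.remove(region, until);
            return false;
        }
        return true;
    }
}
```

Time is passed in explicitly to keep the sketch easy to test; a real client would presumably use its own clock source.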
On split, child regions inherit the blacklist settings.
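The server-side check in step 3 and the inheritance on split can be sketched as follows. This is not HBase code: SketchRegion stands in for HRegion, RegionAccessDeniedException stands in for the proposed subclass of NotServingRegionException, and the META update from step 2 is omitted.

```java
import java.io.IOException;

// Hypothetical names; a sketch of the idea only.
enum AccessMode { READ_WRITE, READ_ONLY, WRITE_ONLY }

// Stand-in for the proposed subclass of NotServingRegionException.
class RegionAccessDeniedException extends IOException {
    RegionAccessDeniedException(String msg) {
        super(msg);
    }
}

class SketchRegion {
    final String name;
    private volatile AccessMode mode;

    SketchRegion(String name, AccessMode mode) {
        this.name = name;
        this.mode = mode;
    }

    // Step 2 (in-memory part only; the META write is not shown).
    void setMode(AccessMode mode) {
        this.mode = mode;
    }

    AccessMode getMode() {
        return mode;
    }

    // Step 3: the check that would run at the start of each region operation.
    void startOperation(boolean isRead) throws RegionAccessDeniedException {
        AccessMode m = mode;
        if (isRead && m == AccessMode.WRITE_ONLY) {
            throw new RegionAccessDeniedException(name + " rejects reads");
        }
        if (!isRead && m == AccessMode.READ_ONLY) {
            throw new RegionAccessDeniedException(name + " rejects writes");
        }
    }

    // On split, both child regions start with the parent's mode.
    SketchRegion[] split() {
        return new SketchRegion[] {
            new SketchRegion(name + "-a", mode),
            new SketchRegion(name + "-b", mode)
        };
    }
}
```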
If we get a timeout exception between steps 2 and 3, we don’t know what the
HRegionInfo in META will be. It is possible to get an exception here while the
mutation eventually gets applied. The risk is that the RS’s view of the region
becomes inconsistent with the META table. We think this is okay because it is a
tool and is supposed to be executed manually. (Alternatively, we could abort
the RS just as the split transaction does, but that is overkill.)
We have thought of using ZK rather than the META table. We prefer the META
table as it is a central place for storing region states.
> Region level read/write degradation
> -----------------------------------
>
> Key: HBASE-8801
> URL: https://issues.apache.org/jira/browse/HBASE-8801
> Project: HBase
> Issue Type: Bug
> Reporter: Chao Shi
>
> We would like to propose a tool for HBase administrators to disable reads
> and/or writes for a single region temporarily.
> Our HBase table at weibo.com is accessed by many clients. We have experienced
> the following outage: due to bugs in user code, a small number of regions are
> read at a very high rate. Requests to other regions on the same RSs are
> affected. In such a scenario, the admin would like to temporarily set the
> region to write-only (forbidding only reads, so writes can still be served).