[
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15708960#comment-15708960
]
Hudson commented on HBASE-17178:
--------------------------------
SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #2047 (See
[https://builds.apache.org/job/HBase-Trunk_matrix/2047/])
HBASE-17178 Add region balance throttling (yangzhe1991: rev
ea912478e28f1f041a6058c09a9d9eab64e01352)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BalancerChore.java
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterBalanceThrottling.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java
* (edit) hbase-common/src/main/resources/hbase-default.xml
* (edit) hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
> Add region balance throttling
> -----------------------------
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
> Issue Type: Improvement
> Components: Balancer
> Affects Versions: 2.0.0, 1.4.0
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-17178-branch-1-v1.patch,
> HBASE-17178-branch-1.patch, HBASE-17178-v1.patch, HBASE-17178-v2.patch,
> HBASE-17178-v3.patch, HBASE-17178-v4.patch, HBASE-17178-v5.patch,
> HBASE-17178-v6.patch
>
>
> Our online cluster serves dozens of tables, and different tables serve
> different services. If the balancer moves too many regions at the same time,
> it reduces availability for some tables or services, so we added region
> balance throttling on our online serving cluster.
> We introduce a new config, hbase.balancer.max.balancing.regions, which sets
> the max number of regions in transition when balancing.
> If we set this to 1 and a table has 100 regions, then the table will have
> at least 99 regions available at any time. It helps a lot for our use case,
> and it has been running for a long time on our production cluster.
> But for some use cases, we need the balancer to run faster. If a cluster has
> 100 regionservers and adds 50 new regionservers for peak requests, it needs
> the balancer to run as soon as possible so that the cluster reaches a
> balanced state quickly. Our idea is to compute the max number of regions in
> transition from the max balancing time and the average time of a region in
> transition. The balancer then uses the computed value for throttling.
> Examples for understanding:
> A cluster has 100 regionservers, each regionserver has 200 regions, and the
> average time of a region in transition is 1 second. We configure the max
> balancing time as 10 * 60 seconds.
> Case 1. One regionserver crashes, so the cluster needs to balance at most
> 200 regions. Then 200 / (10 * 60s / 1s) < 1, which means the max number of
> regions in transition is 1 when balancing. The balancer then moves regions
> one by one, and the cluster keeps high availability while balancing.
> Case 2. Another 100 regionservers are added, so the cluster needs to balance
> at most 10000 regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the
> max number of regions in transition is 17 when balancing. The cluster can
> then reach a balanced state within the max balancing time.
> Any suggestions are welcome.
> Review board: https://reviews.apache.org/r/54191/
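
The arithmetic in Case 1 and Case 2 of the quoted description can be sketched in Java as below. This is only an illustration of the formula as described, not the code from the committed patch: the class name BalanceThrottleSketch, the method maxRegionsInTransition, and its parameter names are hypothetical, and rounding up with a floor of 1 is an assumption inferred from the two worked cases (200 / 600 gives 1, and 16.7 gives 17).

```java
// Hypothetical sketch of the throttling formula from the issue description.
// None of these names come from HBase itself.
class BalanceThrottleSketch {

    /**
     * Computes how many regions may be in transition at once so that
     * regionsToMove regions can all be moved within maxBalancingTimeMs,
     * given that one region move takes avgRitTimeMs on average.
     * Assumption: the result is rounded up and is never below 1.
     */
    static int maxRegionsInTransition(int regionsToMove,
                                      long maxBalancingTimeMs,
                                      long avgRitTimeMs) {
        if (maxBalancingTimeMs <= 0 || avgRitTimeMs <= 0) {
            throw new IllegalArgumentException("times must be positive");
        }
        // regionsToMove / (maxBalancingTime / avgRitTime), as in the description
        double needed = (double) regionsToMove * avgRitTimeMs / maxBalancingTimeMs;
        return Math.max(1, (int) Math.ceil(needed));
    }

    public static void main(String[] args) {
        long maxBalancingTimeMs = 10 * 60 * 1000L; // 10 * 60 seconds
        long avgRitTimeMs = 1000L;                 // 1 second per region move

        // Case 1: one regionserver crashes, at most 200 regions to move.
        System.out.println(maxRegionsInTransition(200, maxBalancingTimeMs, avgRitTimeMs));

        // Case 2: 100 regionservers are added, at most 10000 regions to move.
        System.out.println(maxRegionsInTransition(10000, maxBalancingTimeMs, avgRitTimeMs));
    }
}
```

With the inputs from the description, the two calls in main yield 1 and 17, matching Case 1 and Case 2.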
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)