[ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14309:
---------------------------
    Description: 
This issue adds a boolean parameter, force, to the 'balancer' command so that
an admin can force region balancing even when there are regions in transition
(RIT), assuming the RIT state is transient.

This enhancement was requested by a customer.

The assumption of this change is that the operator has run hbck and has a 
reasonable idea why regions are stuck in transition before using the force flag.

In a recent incident at the customer, a cluster ended up with a small number
of regionservers hosting most of the regions (one regionserver had 50% of the
roughly 20,000 regions). The balancer couldn't be run because a small number
of regions were stuck in transition. The admin ended up killing the
regionservers so that reassignment would yield a more equitable distribution
of the regions.

On a different cluster, a single store file had corrupt HDFS blocks (the SSDs
on the cluster were known to lose data). Because the one affected region (out
of tens of thousands of regions on this cluster) was stuck in transition, the
balancer couldn't run.

The state keeping in HBase is not yet good enough for the admin to kick off
the balancer automatically in such scenarios, knowing when it is safe to do so
and when it is not. Until then, having this option available for the operator
to use as he or she sees fit seems prudent.
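
The gating behaviour described above can be sketched as a toy model. This is
not HBase code: ClusterState and should_run_balancer are hypothetical names
used only for illustration, and the sketch covers only the
region-in-transition check, leaving out any other preconditions the real
master may apply.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Toy stand-in for what the master knows; not an HBase class."""
    regions_per_server: dict    # server name -> number of hosted regions
    regions_in_transition: int  # regions currently in transition (RIT)


def should_run_balancer(state: ClusterState, force: bool = False) -> bool:
    """Return True if a balancing pass may start.

    Without force, any region in transition blocks the run.
    With force, the run proceeds even though RIT is non-empty.
    """
    if state.regions_in_transition > 0 and not force:
        return False
    return True


# One server holds half of roughly 20,000 regions and a few regions are stuck.
skewed = ClusterState(
    regions_per_server={"rs1": 10000, "rs2": 5000, "rs3": 5000},
    regions_in_transition=3,
)
print(should_run_balancer(skewed))              # False
print(should_run_balancer(skewed, force=True))  # True
```

The force argument defaults to False in this sketch so that the existing
behaviour (refuse while any region is in transition) stays the default and
the operator has to opt in explicitly.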

  was:
This issue adds boolean parameter, force, to 'balancer' command so that admin 
can force region balancing even when there is region in transition - assuming 
RIT being transient.

This enhancement was requested by some customer.


> Allow load balancer to operate when there is region in transition by adding 
> force flag
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-14309
>                 URL: https://issues.apache.org/jira/browse/HBASE-14309
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
> 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
> 14309-v5.txt, 14309-v6.txt
>
>
> This issue adds a boolean parameter, force, to the 'balancer' command so
> that an admin can force region balancing even when there are regions in
> transition (RIT), assuming the RIT state is transient.
> This enhancement was requested by a customer.
> The assumption of this change is that the operator has run hbck and has a
> reasonable idea why regions are stuck in transition before using the force
> flag.
> In a recent incident at the customer, a cluster ended up with a small number
> of regionservers hosting most of the regions (one regionserver had 50% of
> the roughly 20,000 regions). The balancer couldn't be run because a small
> number of regions were stuck in transition. The admin ended up killing the
> regionservers so that reassignment would yield a more equitable distribution
> of the regions.
> On a different cluster, a single store file had corrupt HDFS blocks (the
> SSDs on the cluster were known to lose data). Because the one affected
> region (out of tens of thousands of regions on this cluster) was stuck in
> transition, the balancer couldn't run.
> The state keeping in HBase is not yet good enough for the admin to kick off
> the balancer automatically in such scenarios, knowing when it is safe to do
> so and when it is not. Until then, having this option available for the
> operator to use as he or she sees fit seems prudent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
