[ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720293#comment-14720293
 ] 

Jerry He commented on HBASE-14309:
----------------------------------

+1 on the WARNING and guard on meta.
force is usually a last resort :-)

{code}
-        LOG.debug("Not running balancer because " + regionsInTransition.size() +
+        // if hbase:meta region is in transition, result of assignment cannot be recorded
+        // ignore the force flag in that case
+        String prefix = force && !assignmentManager.getRegionStates().isMetaRegionInTransition() ?
+            "R" : "Not r";
+        LOG.debug(prefix + "unning balancer because " + regionsInTransition.size() +
           " region(s) in transition: " + org.apache.commons.lang.StringUtils.
             abbreviate(regionsInTransition.toString(), 256));
-        return false;
+        if (!force) return false;
{code}
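This should also return false when isMetaRegionInTransition() is true, even if force is set. As written, "if (!force) return false;" lets a forced run proceed while hbase:meta is in transition, although the log message says the balancer is not running.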
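
To make the suggested guard concrete, here is a minimal standalone sketch of the decision logic. It is not the HMaster code from the patch: the class BalancerGuardSketch and its methods are hypothetical names for illustration, and the region-in-transition count and meta state are passed in as plain values instead of being read from the assignment manager.

```java
/**
 * Hypothetical sketch of the guard discussed above; not HBase code.
 * Inputs are plain values so the logic can be checked in isolation.
 */
final class BalancerGuardSketch {

    private BalancerGuardSketch() {
    }

    /**
     * Decides whether the balancer may run.
     *
     * @param regionsInTransition number of regions currently in transition
     * @param metaInTransition    true if the hbase:meta region is in transition
     * @param force               true if the operator passed the force flag
     */
    static boolean shouldRunBalancer(int regionsInTransition,
                                     boolean metaInTransition,
                                     boolean force) {
        if (regionsInTransition == 0) {
            return true; // nothing in transition, no guard needed
        }
        if (metaInTransition) {
            // Assignment results cannot be recorded, so force is ignored.
            return false;
        }
        return force; // ordinary RIT: run only when forced
    }

    /** Log prefix chosen from the same condition, so the message matches the decision. */
    static String logPrefix(boolean metaInTransition, boolean force) {
        return (force && !metaInTransition) ? "R" : "Not r";
    }
}
```

With this shape, the log prefix and the return value come from the same condition, so a forced run cannot be logged as "Not running" and then proceed anyway.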



> Allow load balancer to operate when there is region in transition by adding 
> force flag
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-14309
>                 URL: https://issues.apache.org/jira/browse/HBASE-14309
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
> 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
> 14309-v5.txt, 14309-v6.txt, 14309-v7.txt
>
>
> This issue adds a boolean parameter, force, to the 'balancer' command so that an 
> admin can force region balancing even when there are regions in transition (RIT), 
> assuming the RIT is transient.
> This enhancement was requested by a customer.
> The assumption of this change is that the operator has run hbck and has a 
> reasonable idea why regions are stuck in transition before using the force 
> flag.
> There was a recent event at the customer where a cluster ended up with a 
> small number of regionservers hosting most of the regions on the cluster (one 
> regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
> run due to the small number of regions that were stuck in transition. The 
> admin ended up killing the regionservers so that reassignment would yield a 
> more equitable distribution of the regions.
> On a different cluster, there was a single store file that had corrupt HDFS 
> blocks (the SSDs on the cluster were known to lose data). However, since this 
> single region (out of 10s of 1000s of regions on this cluster) was stuck in 
> transition, the balancer couldn't run.
> While the state keeping in HBase isn't so good yet that the admin can kick 
> off the balancer automatically in such scenarios knowing when it is safe to 
> do so and when it is not, having this option available for the operator to 
> use as he / she sees fit seems prudent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
