[ 
https://issues.apache.org/jira/browse/HBASE-22739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895119#comment-16895119
 ] 

Reid Chan commented on HBASE-22739:
-----------------------------------

Spent some time understanding the issue.
This exception indicates that the regionIndex is out of sync. The caller is
{code}
removeRegion(regionsPerServer[mra.fromServer], mra.region);
{code}
If it throws AIOOB, it means the region was not found on the given server after
the whole iteration:
{code}
      int i = 0;
      for (i = 0; i < regions.length; i++) {
        if (regions[i] == regionIndex) {
          break;
        }
        // If regionIndex is never found, i reaches the last index of regions,
        // which is one past the end of newRegions, so the assignment below
        // throws AIOOB.
        newRegions[i] = regions[i];
      }
      System.arraycopy(regions, i+1, newRegions, i, newRegions.length - i);
{code}
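To make the failure mode concrete, here is a minimal standalone sketch. It is not the HBase code: the class name RemoveRegionSketch and both method names are made up for illustration. removeRegionUnchecked copies the loop above, and removeRegionGuarded is one hypothetical way to detect the missing region first. Throwing IllegalStateException is an assumption on my part. Returning the array unchanged would be the other option, though it may hide the out-of-sync state.

```java
import java.util.Arrays;

// Hypothetical standalone class, not part of HBase.
class RemoveRegionSketch {

  // Same loop as the snippet above: nothing guards against a regionIndex
  // that is absent from regions.
  static int[] removeRegionUnchecked(int[] regions, int regionIndex) {
    int[] newRegions = new int[regions.length - 1];
    int i = 0;
    for (i = 0; i < regions.length; i++) {
      if (regions[i] == regionIndex) {
        break;
      }
      // On the last iteration without a match, i == newRegions.length.
      newRegions[i] = regions[i];
    }
    System.arraycopy(regions, i + 1, newRegions, i, newRegions.length - i);
    return newRegions;
  }

  // Illustrative guarded variant: locate the region first, fail loudly if
  // it is missing, then copy the two halves around it.
  static int[] removeRegionGuarded(int[] regions, int regionIndex) {
    int pos = -1;
    for (int i = 0; i < regions.length; i++) {
      if (regions[i] == regionIndex) {
        pos = i;
        break;
      }
    }
    if (pos < 0) {
      throw new IllegalStateException(
          "region " + regionIndex + " not found among " + regions.length + " regions");
    }
    int[] newRegions = new int[regions.length - 1];
    System.arraycopy(regions, 0, newRegions, 0, pos);
    System.arraycopy(regions, pos + 1, newRegions, pos, newRegions.length - pos);
    return newRegions;
  }

  public static void main(String[] args) {
    int[] regions = {5, 7, 9};
    // Region 7 is present, so it is removed normally.
    System.out.println(Arrays.toString(removeRegionGuarded(regions, 7)));
    try {
      // Region 42 is absent, so the unchecked loop runs off newRegions.
      removeRegionUnchecked(regions, 42);
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("AIOOB: " + e.getMessage());
    }
  }
}
```

If the trace in the report comes from that assignment, the 3171 in the exception message would be regions.length - 1 for that server, but that is only an inference from the snippet.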

It is quite hard to trace a region's flow and dig out the root cause right now.
Ping [~anoop.hbase], WDYT.

> ArrayIndexOutOfBoundsException when balance
> -------------------------------------------
>
>                 Key: HBASE-22739
>                 URL: https://issues.apache.org/jira/browse/HBASE-22739
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>            Reporter: casuallc
>            Priority: Major
>             Fix For: 2.1.1
>
>
>  
> {code:java}
> 2019-07-25 15:19:59,828 ERROR [master/nna:16000.Chore.1] hbase.ScheduledChore: Caught error
> java.lang.ArrayIndexOutOfBoundsException: 3171
> at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.removeRegion(BaseLoadBalancer.java:873)
> at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.doAction(BaseLoadBalancer.java:716)
> at org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:407)
> at org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:318)
> at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1650)
> at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1567)
> at org.apache.hadoop.hbase.master.balancer.BalancerChore.chore(BalancerChore.java:49)
> at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> removeRegion should check that the regionIndex is valid. Source file:
> hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
> {code:java}
> int[] removeRegion(int[] regions, int regionIndex) {
>     //TODO: this maybe costly. Consider using linked lists
>     int[] newRegions = new int[regions.length - 1];
>     int i = 0;
>     for (i = 0; i < regions.length; i++) {
>         if (regions[i] == regionIndex) {
>             break;
>         }
>         if (i == regions.length - 1) {
>             return Arrays.copyOf(regions, regions.length);
>         }
>         newRegions[i] = regions[i];
>     }
>     System.arraycopy(regions, i+1, newRegions, i, newRegions.length - i);
>     return newRegions;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
