[ 
https://issues.apache.org/jira/browse/HBASE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-20936:
----------------------------
    Comment: was deleted

(was: 
[https://github.com/apache/hbase/blob/067388bfd9dccdb42fcbeedc45a3862555603892/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L417]
 

The undoAction is not implemented. 
[https://github.com/apache/hbase/blob/067388bfd9dccdb42fcbeedc45a3862555603892/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java#L643]

 

So "regionsPerServer" may already have been changed if StochasticLoadBalancer 
previously tried actions that modified it. 

  HBASE-19263 is also caused by StochasticLoadBalancer. 

So my fix is to check whether the region is still on the server, to avoid the NPE.
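
To illustrate the idea, here is a minimal, self-contained sketch. It is not the attached patch and not the actual BaseLoadBalancer code: the class name RemoveRegionSketch and both method names are hypothetical, and the array handling is a simplified assumption about how a per-server region list might be copied. It shows how removing a region that is no longer in a server's list can run off the end of the new array, and how checking for the region first avoids that.

```java
import java.util.Arrays;

public class RemoveRegionSketch {

    // Hypothetical simplified removal: assumes regionIndex is present in regions.
    static int[] removeRegionUnchecked(int[] regions, int regionIndex) {
        int[] newRegions = new int[regions.length - 1];
        int i;
        for (i = 0; i < regions.length; i++) {
            if (regions[i] == regionIndex) {
                break;
            }
            // If regionIndex is absent, i eventually equals newRegions.length
            // and this write throws ArrayIndexOutOfBoundsException.
            newRegions[i] = regions[i];
        }
        System.arraycopy(regions, i + 1, newRegions, i, newRegions.length - i);
        return newRegions;
    }

    // Hypothetical guarded removal: first check that the region is still listed.
    static int[] removeRegionChecked(int[] regions, int regionIndex) {
        int pos = -1;
        for (int i = 0; i < regions.length; i++) {
            if (regions[i] == regionIndex) {
                pos = i;
                break;
            }
        }
        if (pos < 0) {
            // Region is not on this server (any more); leave the list unchanged.
            return regions;
        }
        int[] newRegions = new int[regions.length - 1];
        System.arraycopy(regions, 0, newRegions, 0, pos);
        System.arraycopy(regions, pos + 1, newRegions, pos, newRegions.length - pos);
        return newRegions;
    }

    public static void main(String[] args) {
        int[] regions = {3, 5, 7};
        System.out.println(Arrays.toString(removeRegionChecked(regions, 5)));
        System.out.println(Arrays.toString(removeRegionChecked(regions, 9)));
        try {
            removeRegionUnchecked(regions, 9);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked removal failed: " + e.getMessage());
        }
    }
}
```

Whether silently skipping the removal is the right behavior, rather than restoring the state that the earlier action changed, is a separate design question that this sketch does not settle.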

 )

> BaseLoadBalancer throws ArrayIndexOutOfBoundsException
> ------------------------------------------------------
>
>                 Key: HBASE-20936
>                 URL: https://issues.apache.org/jira/browse/HBASE-20936
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>    Affects Versions: 2.0.1
>            Reporter: stack
>            Priority: Major
>         Attachments: HBASE-20936.master.001.patch
>
>
> I see the below in a test cluster. The balancer code has been in place a long 
> time. Not sure why it's doing the below. Will look later. Meantime, making a 
> record of it.
> {code}
> 2018-07-24 22:03:43,261 INFO  [master/ve0524:16000.Chore.1] 
> balancer.StochasticLoadBalancer: start StochasticLoadBalancer.balancer, 
> initCost=280.47655219381977, functionCost=RegionCountSkewCostFunction : 
> (500.0, 0.4992354740061162); PrimaryRegionCountSkewCostFunction : (500.0, 
> 0.0); MoveCostFunction : (7.0, 0.0); ServerLocalityCostFunction : (25.0, 
> 0.5961002123692674); RackLocalityCostFunction : (15.0, 0.0); 
> TableSkewCostFunction : (35.0, 0.3660578386605784); 
> RegionReplicaHostCostFunction : (100000.0, 0.0); 
> RegionReplicaRackCostFunction : (10000.0, 0.0); ReadRequestCostFunction : 
> (5.0, 0.0); WriteRequestCostFunction : (5.0, 0.0); MemStoreSizeCostFunction : 
> (5.0, 0.0); StoreFileCostFunction : (5.0, 0.6288571056819432);
> 2018-07-24 22:03:43,322 ERROR [master/ve0524:16000.Chore.1] 
> hbase.ScheduledChore: Caught error
> java.lang.ArrayIndexOutOfBoundsException: 64
>         at 
> org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.removeRegion(BaseLoadBalancer.java:872)
>         at 
> org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.regionMoved(BaseLoadBalancer.java:827)
>         at 
> org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.doAction(BaseLoadBalancer.java:723)
>         at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:399)
>         at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:318)
>         at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1514)
>         at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1445)
>         at 
> org.apache.hadoop.hbase.master.balancer.BalancerChore.chore(BalancerChore.java:49)
>         at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 2018-07-24 22:08:20,101 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=4.61 MB, freeSize=6.14 GB, max=6.14 GB, 
> blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, 
> cachingHits=0, cachingHitsRatio=0,evictions=119, evicted=0, evictedPerRun=0.0
> 2018-07-24 22:08:38,101 INFO  [master/ve0524:16000.Chore.1] 
> balancer.StochasticLoadBalancer: start StochasticLoadBalancer.balancer, 
> initCost=280.4765419610035, functionCost=RegionCountSkewCostFunction : 
> (500.0, 0.4992354740061162); PrimaryRegionCountSkewCostFunction : (500.0, 
> 0.0); MoveCostFunction : (7.0, 0.0); ServerLocalityCostFunction : (25.0, 
> 0.5961002123692674); RackLocalityCostFunction : (15.0, 0.0); 
> TableSkewCostFunction : (35.0, 0.3660578386605784); 
> RegionReplicaHostCostFunction : (100000.0, 0.0); 
> RegionReplicaRackCostFunction : (10000.0, 0.0); ReadRequestCostFunction : 
> (5.0, 0.0); WriteRequestCostFunction : (5.0, 0.0); MemStoreSizeCostFunction : 
> (5.0, 0.0); StoreFileCostFunction : (5.0, 0.6288550591186913);
> 2018-07-24 22:08:38,103 ERROR [master/ve0524:16000.Chore.1] 
> hbase.ScheduledChore: Caught error
> java.lang.ArrayIndexOutOfBoundsException: 100
>         at 
> org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.removeRegion(BaseLoadBalancer.java:872)
>         at 
> org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.regionMoved(BaseLoadBalancer.java:827)
>         at 
> org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.doAction(BaseLoadBalancer.java:723)
>         at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:399)
>         at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:318)
>         at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1514)
>         at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1445)
>         at 
> org.apache.hadoop.hbase.master.balancer.BalancerChore.chore(BalancerChore.java:49)
>         at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
