[ 
https://issues.apache.org/jira/browse/HBASE-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635693#comment-13635693
 ] 

Enis Soztutar commented on HBASE-8374:
--------------------------------------

Nice bug. Agreed that serversToIndex is not populated first. It can also 
happen that RegionLocationFinder returns region locations that we do not 
know about (the RS might have died, we could be caching the data, etc.), so we 
should still guard against serversToIndex.get(loc.get(i)) returning null. 
For the patch, we should not use boxed primitives (for regionLocations = new 
int[numRegions][];). We can use -1 to indicate a null value.
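
A minimal sketch of the guard described above, not the actual patch. It assumes the NPE comes from auto-unboxing a null Integer returned by serversToIndex.get(...) into an int array element. Plain String keys stand in for ServerName, and the class name, the helper name toServerIndexes, and the constant UNKNOWN_SERVER are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RegionLocationIndexSketch {
    // Sentinel for "server not known to the balancer"; keeps the array primitive.
    static final int UNKNOWN_SERVER = -1;

    // Maps each location to its server index, or UNKNOWN_SERVER when the
    // location is absent from serversToIndex (e.g. a dead or stale server).
    static int[] toServerIndexes(List<String> locations,
                                 Map<String, Integer> serversToIndex) {
        int[] indexes = new int[locations.size()];
        for (int i = 0; i < locations.size(); i++) {
            // Hold the result as Integer first: assigning a null Integer
            // directly to an int would throw a NullPointerException.
            Integer idx = serversToIndex.get(locations.get(i));
            indexes[i] = (idx == null) ? UNKNOWN_SERVER : idx;
        }
        return indexes;
    }

    public static void main(String[] args) {
        Map<String, Integer> serversToIndex = new HashMap<>();
        serversToIndex.put("rs1", 0);
        serversToIndex.put("rs2", 1);

        // "rs-dead" is not in the map, so it is recorded as -1.
        int[] result = toServerIndexes(
            Arrays.asList("rs1", "rs-dead", "rs2"), serversToIndex);
        System.out.println(Arrays.toString(result)); // prints [0, -1, 1]
    }
}
```

Any code that later reads regionLocations would need to skip entries equal to -1 rather than use them as array indexes.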
                
> NPE when launching the balance
> ------------------------------
>
>                 Key: HBASE-8374
>                 URL: https://issues.apache.org/jira/browse/HBASE-8374
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>    Affects Versions: 0.95.0
>         Environment: AWS / real cluster with 3 nodes + master
>            Reporter: Nicolas Liochon
>            Assignee: Ted Yu
>             Fix For: 0.98.0, 0.95.1
>
>         Attachments: 8374-trunk.txt, 8374-trunk-v2.txt, 8374-trunk-v3.txt, 
> 8374-trunk-v4.txt
>
>
> I can't reproduce this every time, but I hit it on a fairly clean environment.
> It occurs every 5 minutes (i.e. the balancer period). The impact is severe: the 
> balancer does not run. Once it starts occurring, it occurs every time. I 
> haven't tried restarting the master, but I think that should be enough.
> Now, looking at the code, the NPE is strange. 
> {noformat}
> 2013-04-18 08:09:52,079 ERROR [box,60000,1366281581983-BalancerChore] org.apache.hadoop.hbase.master.balancer.BalancerChore: Caught exception
> java.lang.NullPointerException
>       at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.<init>(BaseLoadBalancer.java:145)
>       at org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:194)
>       at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1295)
>       at org.apache.hadoop.hbase.master.balancer.BalancerChore.chore(BalancerChore.java:48)
>       at org.apache.hadoop.hbase.Chore.run(Chore.java:81)
>       at java.lang.Thread.run(Thread.java:662)
> 2013-04-18 08:09:52,103 DEBUG [box,60000,1366281581983-CatalogJanitor] org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. starting at key ''
> {noformat}
> {code}
>           if (regionFinder != null) {
>             //region location
>             List<ServerName> loc = regionFinder.getTopBlockLocations(region);
>             regionLocations[regionIndex] = new int[loc.size()];
>             for (int i=0; i < loc.size(); i++) {
>               regionLocations[regionIndex][i] = serversToIndex.get(loc.get(i));  // <========= NPE here
>             }
>           }
> {code}
> pinging [~enis], just in case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira