Nihal Jain created HBASE-28913:
----------------------------------

Summary: LoadBalancerPerformanceEvaluation fails with NPE
Key: HBASE-28913
URL: https://issues.apache.org/jira/browse/HBASE-28913
Project: HBase
Issue Type: Task
Reporter: Nihal Jain
Assignee: Nihal Jain

While testing [https://github.com/apache/hbase/pull/6258], I found that LoadBalancerPerformanceEvaluation fails with an NPE. The failure is not related to the PR, as it also occurs on master. This Jira is to track and fix the issue.

{code:java}
bin % ./hbase org.apache.hadoop.hbase.master.balancer.LoadBalancerPerformanceEvaluation -regions 30000 -servers 10
2024-10-14T01:48:06,772 INFO [main {}] metrics.MetricRegistries: Loaded MetricRegistries class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
2024-10-14T01:48:06,841 INFO [main {}] balancer.LoadBalancerPerformanceEvaluation: Calling roundRobinAssignment
2024-10-14T01:48:06,850 INFO [main {}] balancer.BalancerClusterState: server 0 is on rack 0
2024-10-14T01:48:06,850 INFO [main {}] balancer.BalancerClusterState: server 1 is on rack 0
2024-10-14T01:48:06,850 INFO [main {}] balancer.BalancerClusterState: server 2 is on rack 0
2024-10-14T01:48:06,850 INFO [main {}] balancer.BalancerClusterState: server 3 is on rack 0
2024-10-14T01:48:06,850 INFO [main {}] balancer.BalancerClusterState: server 4 is on rack 0
2024-10-14T01:48:06,850 INFO [main {}] balancer.BalancerClusterState: server 5 is on rack 0
2024-10-14T01:48:06,850 INFO [main {}] balancer.BalancerClusterState: server 6 is on rack 0
2024-10-14T01:48:06,850 INFO [main {}] balancer.BalancerClusterState: server 7 is on rack 0
2024-10-14T01:48:06,850 INFO [main {}] balancer.BalancerClusterState: server 8 is on rack 0
2024-10-14T01:48:06,850 INFO [main {}] balancer.BalancerClusterState: server 9 is on rack 0
Time for roundRobinAssignment : 238ms
2024-10-14T01:48:07,079 INFO [main {}] balancer.LoadBalancerPerformanceEvaluation: Calling retainAssignment
2024-10-14T01:48:07,091 INFO [main {}] balancer.BalancerClusterState: server 0 is on rack 0
2024-10-14T01:48:07,091 INFO [main {}] balancer.BalancerClusterState: server 1 is on rack 0
2024-10-14T01:48:07,091 INFO [main {}] balancer.BalancerClusterState: server 2 is on rack 0
2024-10-14T01:48:07,091 INFO [main {}] balancer.BalancerClusterState: server 3 is on rack 0
2024-10-14T01:48:07,091 INFO [main {}] balancer.BalancerClusterState: server 4 is on rack 0
2024-10-14T01:48:07,091 INFO [main {}] balancer.BalancerClusterState: server 5 is on rack 0
2024-10-14T01:48:07,092 INFO [main {}] balancer.BalancerClusterState: server 6 is on rack 0
2024-10-14T01:48:07,092 INFO [main {}] balancer.BalancerClusterState: server 7 is on rack 0
2024-10-14T01:48:07,092 INFO [main {}] balancer.BalancerClusterState: server 8 is on rack 0
2024-10-14T01:48:07,092 INFO [main {}] balancer.BalancerClusterState: server 9 is on rack 0
2024-10-14T01:48:07,284 INFO [main {}] balancer.BaseLoadBalancer: Reassigned 30000 regions. 0 retained the pre-restart assignment. 30000 regions were assigned to random hosts, since the old hosts for these regions are no longer present in the cluster. These hosts were:
Time for retainAssignment : 204ms
2024-10-14T01:48:07,284 INFO [main {}] balancer.LoadBalancerPerformanceEvaluation: Calling balanceCluster
2024-10-14T01:48:07,315 INFO [main {}] balancer.BalancerClusterState: server 0 is on rack 0
2024-10-14T01:48:07,315 INFO [main {}] balancer.BalancerClusterState: server 1 is on rack 0
2024-10-14T01:48:07,315 INFO [main {}] balancer.BalancerClusterState: server 2 is on rack 0
2024-10-14T01:48:07,323 INFO [main {}] balancer.BalancerClusterState: server 3 is on rack 0
2024-10-14T01:48:07,323 INFO [main {}] balancer.BalancerClusterState: server 4 is on rack 0
2024-10-14T01:48:07,323 INFO [main {}] balancer.BalancerClusterState: server 5 is on rack 0
2024-10-14T01:48:07,323 INFO [main {}] balancer.BalancerClusterState: server 6 is on rack 0
2024-10-14T01:48:07,323 INFO [main {}] balancer.BalancerClusterState: server 7 is on rack 0
2024-10-14T01:48:07,323 INFO [main {}] balancer.BalancerClusterState: server 8 is on rack 0
2024-10-14T01:48:07,323 INFO [main {}] balancer.BalancerClusterState: server 9 is on rack 0
2024-10-14T01:48:07,325 ERROR [main {}] util.AbstractHBaseTool: Error running command-line tool
java.lang.NullPointerException: Cannot invoke "java.util.List.size()" because "this.candidateGenerators" is null
    at org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.initCosts(StochasticLoadBalancer.java:750) ~[hbase-balancer-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceTable(StochasticLoadBalancer.java:475) ~[hbase-balancer-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.balanceCluster(BaseLoadBalancer.java:620) ~[hbase-balancer-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at org.apache.hadoop.hbase.master.balancer.LoadBalancerPerformanceEvaluation.doWork(LoadBalancerPerformanceEvaluation.java:172) ~[hbase-balancer-4.0.0-alpha-1-SNAPSHOT-tests.jar:4.0.0-alpha-1-SNAPSHOT]
    at org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:150) ~[hbase-common-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at org.apache.hadoop.hbase.master.balancer.LoadBalancerPerformanceEvaluation.main(LoadBalancerPerformanceEvaluation.java:181) ~[hbase-balancer-4.0.0-alpha-1-SNAPSHOT-tests.jar:4.0.0-alpha-1-SNAPSHOT]
{code}
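
The trace shows only that `candidateGenerators` was null when `initCosts` called `size()` on it; it does not show why. One common pattern that produces a failure of this shape is a field that is populated in a separate setup step which the caller never ran. The sketch below illustrates that pattern and a fail-fast guard. It is not HBase code and not a diagnosis of this issue: `SketchBalancer`, `configure()`, `countGeneratorsUnguarded()` and `countGeneratorsGuarded()` are hypothetical names, and only the field name is borrowed from the exception message.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a balancer; not the HBase class. */
class SketchBalancer {
  // Assumption for illustration: the list is created only by configure(),
  // so it stays null if the caller skips that step.
  private List<String> candidateGenerators;

  /** Setup step that creates and fills the list. */
  void configure() {
    candidateGenerators = new ArrayList<>();
    candidateGenerators.add("random");
    candidateGenerators.add("load");
  }

  /** Unguarded access: throws NullPointerException if configure() was never called. */
  int countGeneratorsUnguarded() {
    return candidateGenerators.size();
  }

  /** Guarded access: fails fast with a message that names the missing step. */
  int countGeneratorsGuarded() {
    if (candidateGenerators == null) {
      throw new IllegalStateException("configure() must be called before use");
    }
    return candidateGenerators.size();
  }
}

public class NpeSketch {
  public static void main(String[] args) {
    SketchBalancer balancer = new SketchBalancer();
    try {
      balancer.countGeneratorsUnguarded();
    } catch (NullPointerException e) {
      System.out.println("NPE before configure()");
    }
    balancer.configure();
    System.out.println("generators after configure(): " + balancer.countGeneratorsGuarded());
  }
}
```

Whether the tool skips a setup step or the balancer should tolerate a missing one has to be decided against the actual source; the sketch only shows why the order of calls matters for a lazily populated field.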