[ 
https://issues.apache.org/jira/browse/HBASE-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938176#comment-16938176
 ] 

Guanghao Zhang commented on HBASE-23035:
----------------------------------------

There are two problems with the LoadBalancer.

 

1. The cluster object is meant to represent the state of the whole cluster. But 
hasRegionReplica is passed as false, so createCluster only builds the cluster 
state from the regions that need to be assigned, not from the whole cluster:
{code:java}
Cluster cluster = createCluster(servers, regions, false);
List<RegionInfo> unassignedRegions = new ArrayList<>();
roundRobinAssignment(cluster, regions, unassignedRegions,
  servers, assignments);


  protected Cluster createCluster(List<ServerName> servers, Collection<RegionInfo> regions,
      boolean hasRegionReplica) {
    // Get the snapshot of the current assignments for the regions in question, and then create
    // a cluster out of it. Note that we might have replicas already assigned to some servers
    // earlier. So we want to get the snapshot to see those assignments, but this will only contain
    // replicas of the regions that are passed (for performance).
    Map<ServerName, List<RegionInfo>> clusterState = null;
    if (!hasRegionReplica) {
      clusterState = getRegionAssignmentsByServer(regions);
    } else {
      // for the case where we have region replica it is better we get the entire cluster's snapshot
      clusterState = getRegionAssignmentsByServer(null);
    }
    for (ServerName server : servers) {
      if (!clusterState.containsKey(server)) {
        clusterState.put(server, EMPTY_REGION_LIST);
      }
    }
    return new Cluster(regions, clusterState, null, this.regionFinder,
        rackManager);
  }
{code}
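To illustrate the first problem, here is a minimal, self-contained sketch (not HBase code) of how a snapshot restricted to the regions being assigned can hide the load that servers already carry. The class SnapshotSketch, the method snapshot, and the string server and region names are all hypothetical; passing null as the filter stands in for the whole-cluster case, mirroring getRegionAssignmentsByServer(null) above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SnapshotSketch {
  // Hypothetical stand-in for an assignment snapshot: returns server -> regions,
  // keeping only the regions in `filter`, or every region when `filter` is null.
  static Map<String, List<String>> snapshot(Map<String, List<String>> all,
      Collection<String> filter) {
    Map<String, List<String>> result = new HashMap<>();
    for (Map.Entry<String, List<String>> e : all.entrySet()) {
      List<String> kept = new ArrayList<>();
      for (String region : e.getValue()) {
        if (filter == null || filter.contains(region)) {
          kept.add(region);
        }
      }
      // Servers with no matching region are left out, like servers that
      // would later be filled in with an empty region list.
      if (!kept.isEmpty()) {
        result.put(e.getKey(), kept);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Assumed current assignments: rs1 is already much busier than rs2.
    Map<String, List<String>> all = new HashMap<>();
    all.put("rs1", Arrays.asList("a", "b", "c"));
    all.put("rs2", Arrays.asList("d"));

    // Regions from a crashed server; they are not assigned anywhere yet.
    List<String> toAssign = Arrays.asList("x", "y");

    // Restricted snapshot: no server appears to host anything.
    System.out.println("restricted servers: " + snapshot(all, toAssign).size());
    // Whole-cluster snapshot: the existing imbalance is visible.
    System.out.println("rs1 regions: " + snapshot(all, null).get("rs1").size());
  }
}
```

With the restricted snapshot, both servers look equally empty, so an assignment based on it cannot take the existing load into account.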
2. The wouldLowerAvailability method only considers primary regions. A replica 
region can't be assigned to the same server as its primary region, but it can 
still be assigned to the same server as other replicas of the same region.
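The difference described in point 2 can be sketched as follows. This is an illustration under stated assumptions, not the actual wouldLowerAvailability implementation: the class ReplicaPlacementSketch, its Region type, and both method names are hypothetical, and replica id 0 is taken to mean the primary.

```java
import java.util.Arrays;
import java.util.List;

class ReplicaPlacementSketch {
  // Hypothetical region model: replicaId 0 is the primary, 1..n are secondary replicas.
  static final class Region {
    final String name;
    final int replicaId;

    Region(String name, int replicaId) {
      this.name = name;
      this.replicaId = replicaId;
    }
  }

  // Behaviour described in point 2: only a co-hosted primary blocks the placement.
  static boolean conflictsWithPrimaryOnly(List<Region> hosted, Region candidate) {
    for (Region r : hosted) {
      if (r.name.equals(candidate.name) && r.replicaId == 0) {
        return true;
      }
    }
    return false;
  }

  // Stricter check: any other copy of the same region blocks the placement.
  static boolean conflictsWithAnyReplica(List<Region> hosted, Region candidate) {
    for (Region r : hosted) {
      if (r.name.equals(candidate.name) && r.replicaId != candidate.replicaId) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // The server already hosts replica 1 of region "t1"; we try to place replica 2.
    List<Region> hosted = Arrays.asList(new Region("t1", 1));
    Region candidate = new Region("t1", 2);

    System.out.println("primary-only check: " + conflictsWithPrimaryOnly(hosted, candidate));
    System.out.println("any-replica check: " + conflictsWithAnyReplica(hosted, candidate));
  }
}
```

In this sketch the primary-only check lets two secondary replicas of the same region share a server, while the stricter check rejects that placement.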

> Retain region to the last RegionServer make the failover slower
> ---------------------------------------------------------------
>
>                 Key: HBASE-23035
>                 URL: https://issues.apache.org/jira/browse/HBASE-23035
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 2.3.0, 2.2.1, 2.1.6
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0, 2.2.2
>
>
> Currently, if a RS crashes, its regions first try to be redeployed to their 
> old location. But a RS has only 3 threads for opening regions by default. If 
> a RS hosts hundreds of regions, failover is very slow. Assigning to the same 
> RS may give good locality if the DataNode is deployed on the same host, but 
> slower failover makes availability worse, and locality is not a big deal 
> when HBase is deployed on the cloud.
> This was introduced by HBASE-18946.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
