[ https://issues.apache.org/jira/browse/HBASE-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938176#comment-16938176 ]
Guanghao Zhang commented on HBASE-23035:
----------------------------------------
There are two problems with the LoadBalancer.
1. The Cluster object is meant to represent the state of the whole cluster. But
hasRegionReplica is passed as false, so the cluster state is created only from
the regions that need to be assigned, not from the whole cluster.
{code:java}
Cluster cluster = createCluster(servers, regions, false);
List<RegionInfo> unassignedRegions = new ArrayList<>();
roundRobinAssignment(cluster, regions, unassignedRegions,
  servers, assignments);

protected Cluster createCluster(List<ServerName> servers,
    Collection<RegionInfo> regions,
    boolean hasRegionReplica) {
  // Get the snapshot of the current assignments for the regions in question, and then create
  // a cluster out of it. Note that we might have replicas already assigned to some servers
  // earlier. So we want to get the snapshot to see those assignments, but this will only contain
  // replicas of the regions that are passed (for performance).
  Map<ServerName, List<RegionInfo>> clusterState = null;
  if (!hasRegionReplica) {
    clusterState = getRegionAssignmentsByServer(regions);
  } else {
    // for the case where we have region replica it is better we get the entire cluster's snapshot
    clusterState = getRegionAssignmentsByServer(null);
  }

  for (ServerName server : servers) {
    if (!clusterState.containsKey(server)) {
      clusterState.put(server, EMPTY_REGION_LIST);
    }
  }
  return new Cluster(regions, clusterState, null, this.regionFinder,
    rackManager);
}
{code}
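To illustrate the first problem, here is a minimal, self-contained sketch of how a snapshot restricted to the regions being assigned can hide the load that servers already carry. It is a simplified model with plain strings, not the HBase implementation: the class ClusterSnapshotSketch, the method clusterState and the wholeCluster flag are hypothetical names, and the filtering rule is an assumption based on the code comment above ("this will only contain replicas of the regions that are passed").

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; none of these names are HBase APIs.
class ClusterSnapshotSketch {

  /**
   * Builds a server -> regions map. With wholeCluster == false only the
   * regions listed in 'regions' are kept (assumed behaviour of the partial
   * snapshot); with wholeCluster == true every current assignment is kept.
   * Servers without an entry are then added with an empty list.
   */
  static Map<String, List<String>> clusterState(Map<String, List<String>> all,
      List<String> servers, Collection<String> regions, boolean wholeCluster) {
    Set<String> wanted = new HashSet<>(regions);
    Map<String, List<String>> state = new HashMap<>();
    for (Map.Entry<String, List<String>> e : all.entrySet()) {
      List<String> kept = new ArrayList<>();
      for (String region : e.getValue()) {
        if (wholeCluster || wanted.contains(region)) {
          kept.add(region);
        }
      }
      if (!kept.isEmpty()) {
        state.put(e.getKey(), kept);
      }
    }
    for (String server : servers) {
      state.putIfAbsent(server, new ArrayList<>());
    }
    return state;
  }

  public static void main(String[] args) {
    Map<String, List<String>> all = new HashMap<>();
    all.put("rs1", Arrays.asList("r1", "r2", "r3"));
    all.put("rs2", Arrays.asList("r4"));
    List<String> servers = Arrays.asList("rs1", "rs2");
    // Regions from a crashed server: currently assigned nowhere.
    List<String> toAssign = Arrays.asList("r5", "r6");

    Map<String, List<String>> partial = clusterState(all, servers, toAssign, false);
    Map<String, List<String>> whole = clusterState(all, servers, toAssign, true);

    System.out.println(partial.get("rs1").size()); // prints 0
    System.out.println(whole.get("rs1").size());   // prints 3
  }
}
```

In this model the partial snapshot reports rs1 as empty even though it already hosts three regions, so any decision based on it cannot take the existing load into account.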
2. The wouldLowerAvailability method only considers primary regions. A replica
region cannot be assigned to the same server as its primary region, but it can
be assigned to the same server as other replica regions.
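The difference between a primary-only check and a check over all replicas can be sketched as follows. This is a simplified model, not the actual wouldLowerAvailability code: the class ReplicaPlacementSketch, the Replica type and both method names are hypothetical, replica id 0 is taken to mean the primary, and "other replica regions" is assumed to mean other replicas of the same region.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; none of these names are HBase APIs.
class ReplicaPlacementSketch {

  /** One replica of a region; replicaId 0 denotes the primary (assumption). */
  static final class Replica {
    final String regionName;
    final int replicaId;

    Replica(String regionName, int replicaId) {
      this.regionName = regionName;
      this.replicaId = replicaId;
    }
  }

  /** Primary-only check: conflicts only if the server hosts the primary of the same region. */
  static boolean conflictsWithPrimaryOnly(List<Replica> hosted, Replica candidate) {
    for (Replica r : hosted) {
      if (r.regionName.equals(candidate.regionName) && r.replicaId == 0) {
        return true;
      }
    }
    return false;
  }

  /** Stricter check: conflicts if the server hosts any replica of the same region. */
  static boolean conflictsWithAnyReplica(List<Replica> hosted, Replica candidate) {
    for (Replica r : hosted) {
      if (r.regionName.equals(candidate.regionName)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // The server already hosts replica 1 of regionA; we try to place replica 2.
    List<Replica> hosted = Arrays.asList(new Replica("regionA", 1));
    Replica candidate = new Replica("regionA", 2);

    System.out.println(conflictsWithPrimaryOnly(hosted, candidate)); // prints false
    System.out.println(conflictsWithAnyReplica(hosted, candidate));  // prints true
  }
}
```

Under these assumptions the primary-only check lets two non-primary replicas of the same region share a server, while the stricter check rejects that placement.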
> Retain region to the last RegionServer make the failover slower
> ---------------------------------------------------------------
>
> Key: HBASE-23035
> URL: https://issues.apache.org/jira/browse/HBASE-23035
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0, 2.3.0, 2.2.1, 2.1.6
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.2.2
>
>
> Now if one RS crashes, its regions will try to use the old location when they
> are redeployed. But one RS has only 3 threads to open regions by default. If an
> RS has hundreds of regions, the failover is very slow. Assigning to the same RS
> may give good locality if the DataNode is deployed on the same host. But slower
> failover makes availability worse. And locality is not a big deal when
> HBase is deployed in the cloud.
> This was introduced by HBASE-18946.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)