cuijianwei created HBASE-12762:
----------------------------------
Summary: Region with no hfiles will have the highest locality cost
in LocalityCostFunction
Key: HBASE-12762
URL: https://issues.apache.org/jira/browse/HBASE-12762
Project: HBase
Issue Type: Improvement
Components: Balancer
Affects Versions: 0.99.2
Reporter: cuijianwei
Priority: Minor
The locality cost of a region is computed in LocalityCostFunction.cost as:
{code}
double cost() {
  ...
  int index = -1;
  for (int j = 0; j < regionLocations.length; j++) {
    if (regionLocations[j] >= 0 && regionLocations[j] == serverIndex) {
      index = j;
      break;
    }
  }
  if (index < 0) {
    cost += 1; // ==> region with no hfiles will have the highest cost
  } else {
    cost += (double) index / (double) regionLocations.length;
  }
  ...
}
{code}
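To make the effect concrete, here is a minimal, self-contained sketch of the per-region term from the excerpt above. The class name `LocalityCostSketch` and the method `regionCost` are hypothetical; only the loop and the two branches come from the snippet, and the balancer state is replaced by plain arrays. It also assumes that `regionLocations` lists server indexes from best to worst locality, as the `index / length` term suggests.

```java
public class LocalityCostSketch {

    /**
     * Per-region locality term, mirroring the excerpt from LocalityCostFunction.cost.
     * regionLocations is assumed to hold server indexes ordered from best to worst
     * locality; serverIndex is the server currently hosting the region.
     */
    static double regionCost(int[] regionLocations, int serverIndex) {
        int index = -1;
        for (int j = 0; j < regionLocations.length; j++) {
            if (regionLocations[j] >= 0 && regionLocations[j] == serverIndex) {
                index = j;
                break;
            }
        }
        if (index < 0) {
            // Hosting server not found in the list. An empty list (no hfiles)
            // also ends up here, because the loop body never runs.
            return 1.0;
        }
        return (double) index / (double) regionLocations.length;
    }

    public static void main(String[] args) {
        int[] withHfiles = {2, 0, 1};
        int[] noHfiles = {};

        System.out.println(regionCost(withHfiles, 2)); // hosted on the best server
        System.out.println(regionCost(withHfiles, 1)); // hosted on the last listed server
        System.out.println(regionCost(withHfiles, 5)); // hosted on a server with no locality
        System.out.println(regionCost(noHfiles, 0));   // empty region: same cost as the line above
    }
}
```

In this sketch the empty region and the region hosted on a server with no locality both contribute 1, which is the behaviour the issue describes.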
A region with no hfiles (such as an empty region) gets the highest cost,
which represents the worst case: a region hosted on a server with no
locality for its hfiles. However, this may actually be the best case, because
there are no hlogs for the region. Although the absolute cost value won't affect
the balancing process, would it be more reasonable to assign zero cost to such
regions, for example:
{code}
  ...
  if (index < 0) {
    if (regionLocations.length > 0) { // ==> only consider regions with hfiles
      cost += 1;
    }
  } else {
    cost += (double) index / (double) regionLocations.length;
  }
  ...
{code}
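The claim that the absolute value does not affect balancing can be checked with a small worked example. The sketch below is illustrative only: `ProposedLocalityCostSketch`, `regionCost`, `totalCost` and the `skipEmpty` flag are hypothetical names, the total is a plain unnormalized sum, and `regionLocations` is assumed to be ordered from best to worst locality. It compares two candidate assignments under the current and the proposed rule.

```java
public class ProposedLocalityCostSketch {

    /**
     * Per-region locality term. With skipEmpty == false this follows the current
     * excerpt; with skipEmpty == true a region with an empty location list
     * (no hfiles) contributes zero, as proposed in this issue.
     */
    static double regionCost(int[] regionLocations, int serverIndex, boolean skipEmpty) {
        int index = -1;
        for (int j = 0; j < regionLocations.length; j++) {
            if (regionLocations[j] >= 0 && regionLocations[j] == serverIndex) {
                index = j;
                break;
            }
        }
        if (index < 0) {
            // Only regions that actually have hfiles are charged when skipEmpty is set.
            return (skipEmpty && regionLocations.length == 0) ? 0.0 : 1.0;
        }
        return (double) index / (double) regionLocations.length;
    }

    /** Unnormalized sum over all regions; assignment[r] is the server hosting region r. */
    static double totalCost(int[][] locations, int[] assignment, boolean skipEmpty) {
        double cost = 0;
        for (int r = 0; r < locations.length; r++) {
            cost += regionCost(locations[r], assignment[r], skipEmpty);
        }
        return cost;
    }

    public static void main(String[] args) {
        // Region 0 has hfiles (best on server 0, then server 1); region 1 is empty.
        int[][] locations = { {0, 1}, {} };
        int[] planA = {0, 0};
        int[] planB = {1, 1};

        double currentDiff = totalCost(locations, planB, false) - totalCost(locations, planA, false);
        double proposedDiff = totalCost(locations, planB, true) - totalCost(locations, planA, true);

        // The empty region adds the same constant to every plan under the current
        // rule, so the gap between the two plans is the same under both rules.
        System.out.println(currentDiff == proposedDiff);
    }
}
```

Under these assumptions the empty region shifts every plan's total by the same constant, so the proposed change alters the reported cost but not which plan the comparison prefers.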