cuijianwei created HBASE-12762:
----------------------------------

             Summary: Region with no hfiles will have the highest locality cost in LocalityCostFunction
                 Key: HBASE-12762
                 URL: https://issues.apache.org/jira/browse/HBASE-12762
             Project: HBase
          Issue Type: Improvement
          Components: Balancer
    Affects Versions: 0.99.2
            Reporter: cuijianwei
            Priority: Minor


The locality cost of a region is computed in LocalityCostFunction.cost as:
{code}
double cost() {
        ...
        int index = -1;
        for (int j = 0; j < regionLocations.length; j++) {
          if (regionLocations[j] >= 0 && regionLocations[j] == serverIndex) {
            index = j;
            break;
          }
        }

        if (index < 0) {
          cost += 1;  // ==> region with no hfiles will have the highest cost
        } else {
          cost += (double) index / (double) regionLocations.length;
        }
        ...
    }
{code}
A region with no hfiles (such as an empty region) gets the highest cost, which 
represents the worst case: a region hosted on a server with no locality for its 
hfiles. However, this may actually be the best case, because the region has no 
hfiles that could be non-local. Although the absolute cost value won't affect 
the balancing process, would it be more reasonable to give such regions zero 
cost, for example:
{code}
   ...
        if (index < 0) {
          if (regionLocations.length > 0) { // ==> only consider regions with hfiles
            cost += 1;
          }
        } else {
          cost += (double) index / (double) regionLocations.length;
        }
   ...
{code} 
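
To make the difference concrete, here is a minimal standalone sketch of the per-region cost under both variants. It is not the actual balancer code: the class name LocalityCostSketch and the methods currentCost, proposedCost and indexOf are hypothetical, and it assumes that a region with no hfiles shows up as an empty regionLocations array, which is what the proposed length check implies.

```java
public class LocalityCostSketch {

    // Hypothetical helper: position of serverIndex in the region's
    // locality-ordered server list, or -1 if the server is not listed.
    public static int indexOf(int[] regionLocations, int serverIndex) {
        for (int j = 0; j < regionLocations.length; j++) {
            if (regionLocations[j] >= 0 && regionLocations[j] == serverIndex) {
                return j;
            }
        }
        return -1;
    }

    // Behaviour described in the issue: a server that is not in the list
    // always costs 1, even when the list is empty.
    public static double currentCost(int[] regionLocations, int serverIndex) {
        int index = indexOf(regionLocations, serverIndex);
        if (index < 0) {
            return 1.0;
        }
        return (double) index / (double) regionLocations.length;
    }

    // Proposed behaviour: only charge the maximum cost when the region
    // actually has locality information (assumed to mean it has hfiles).
    public static double proposedCost(int[] regionLocations, int serverIndex) {
        int index = indexOf(regionLocations, serverIndex);
        if (index < 0) {
            return regionLocations.length > 0 ? 1.0 : 0.0;
        }
        return (double) index / (double) regionLocations.length;
    }

    public static void main(String[] args) {
        int[] empty = {};          // assumed shape for a region with no hfiles
        int[] ranked = {3, 7, 5};  // servers ordered from best to worst locality

        System.out.println(currentCost(empty, 7));   // 1.0
        System.out.println(proposedCost(empty, 7));  // 0.0
        System.out.println(proposedCost(ranked, 3)); // 0.0
        System.out.println(proposedCost(ranked, 9)); // 1.0
    }
}
```

With this change, regions that do have hfiles are costed exactly as before; only the empty-list case moves from 1 to 0.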



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
