[
https://issues.apache.org/jira/browse/HBASE-12829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815935#comment-16815935
]
Biju Nair commented on HBASE-12829:
-----------------------------------
In the current version of SLB,
[Read-writeRequestCostFunction|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1465]
extends
[CostFromRegionLoadAsRateFunction|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1436]
which in turn uses the [average of the region requests stored for a
period|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1443]
which seems to address this issue. Can this be closed?
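To make the rate-based idea concrete, here is a minimal sketch of averaging the differences between consecutive cumulative request counts. It is an illustration only, not the code of CostFromRegionLoadAsRateFunction: the class name RateCostSketch, the method averageRate, and the choice to clamp negative differences to zero (as would appear after a counter reset on region move) are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of HBase.
final class RateCostSketch {

    // Averages the per-interval increase across consecutive cumulative samples.
    static double averageRate(List<Double> cumulativeCounts) {
        if (cumulativeCounts.size() < 2) {
            return 0; // a single sample carries no rate information
        }
        double sum = 0;
        for (int i = 1; i < cumulativeCounts.size(); i++) {
            double delta = cumulativeCounts.get(i) - cumulativeCounts.get(i - 1);
            // Assumption: a negative delta means the counter was reset
            // (for example after a region move), so it is counted as zero.
            sum += Math.max(0, delta);
        }
        return sum / (cumulativeCounts.size() - 1);
    }

    public static void main(String[] args) {
        // Steady growth: deltas are 200 and 400.
        System.out.println(averageRate(Arrays.asList(100.0, 300.0, 700.0)));
        // Counter reset between the second and third sample.
        System.out.println(averageRate(Arrays.asList(1000.0, 1200.0, 50.0, 150.0)));
    }
}
```

With a rate of this kind, a region that has been online for a long time and a region that was just moved are compared by recent throughput rather than by lifetime totals.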
> Request count in RegionLoad may not accurate to compute the load cost for
> region
> --------------------------------------------------------------------------------
>
> Key: HBASE-12829
> URL: https://issues.apache.org/jira/browse/HBASE-12829
> Project: HBase
> Issue Type: Improvement
> Components: Balancer
> Affects Versions: 0.99.2
> Reporter: Jianwei Cui
> Priority: Minor
>
> StochasticLoadBalancer#RequestCostFunction (ReadRequestCostFunction and
> WriteRequestCostFunction) computes the load cost for a region from a number
> of remembered region loads. Each region load records the total read/write
> request count since the region was opened, as of the report time. However,
> the request count is reset when the region moves, so the newly reported count
> no longer represents the total requests. For example, if a region has high
> write throughput, the WriteRequest in its region load will be very large
> after the region has been online for a long time; if the region then moves,
> the new WriteRequest will be much smaller, so the region contributes much
> less to the cost of the region server hosting it. We may need to take the
> region open time into account to get a more accurate region load.
> Alternatively, how about using the read/write request count for each time
> slot instead of the total request count? With total counts, older read/write
> throughput contributes more to the cost computed by
> CostFromRegionLoadFunction#getRegionLoadCost:
> {code}
> protected double getRegionLoadCost(Collection<RegionLoad> regionLoadList) {
>   double cost = 0;
>   for (RegionLoad rl : regionLoadList) {
>     double toAdd = getCostFromRl(rl);
>     if (cost == 0) {
>       cost = toAdd;
>     } else {
>       cost = (.5 * cost) + (.5 * toAdd);
>     }
>   }
>   return cost;
> }
> {code}
> For example, assume the balancer remembers three loads for a region at
> times t1, t2, t3 (t1 < t2 < t3), and the write request counts are w1, w2, w3
> for the time slots [0, t1), [t1, t2), [t2, t3) respectively. The WriteRequest
> in the region load at t1, t2, t3 will then be w1, w1 + w2, w1 + w2 + w3, and
> the WriteRequest cost will be:
> {code}
> 0.5 * (w1 + w2 + w3) + 0.25 * (w1 + w2) + 0.25 * w1 = w1 + 0.75 * w2 +
> 0.5 * w3
> {code}
> Here w1 contributes more to the cost than w2 and w3. Intuitively, however,
> recent read/write throughput should represent the current load of the region
> better than older throughput. Therefore, how about using w1, w2, and w3
> directly in the computation? The cost would then become:
> {code}
> 0.25 * w1 + 0.25 * w2 + 0.5 * w3
> {code}
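The two weightings in the description above can be checked numerically with a small sketch. It applies the same folding as the quoted getRegionLoadCost to plain doubles, once on cumulative totals and once on per-slot counts. The class RegionLoadCostSketch, its method names, and the sample values 100, 200, 400 for w1, w2, w3 are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of HBase.
final class RegionLoadCostSketch {

    // Same folding as the quoted getRegionLoadCost, applied to raw values:
    // each new sample is averaged 50/50 with the running cost.
    static double decayedCost(List<Double> samples) {
        double cost = 0;
        for (double sample : samples) {
            cost = (cost == 0) ? sample : (0.5 * cost) + (0.5 * sample);
        }
        return cost;
    }

    // Turns per-slot counts (w1, w2, w3) into running totals
    // (w1, w1 + w2, w1 + w2 + w3), as a region load would report them.
    static List<Double> toCumulative(List<Double> perSlot) {
        List<Double> totals = new ArrayList<>();
        double sum = 0;
        for (double value : perSlot) {
            sum += value;
            totals.add(sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Double> perSlot = Arrays.asList(100.0, 200.0, 400.0); // w1, w2, w3

        // Cumulative input: w1 + 0.75 * w2 + 0.5 * w3
        System.out.println(decayedCost(toCumulative(perSlot)));

        // Per-slot input: 0.25 * w1 + 0.25 * w2 + 0.5 * w3
        System.out.println(decayedCost(perSlot));
    }
}
```

For these sample values the cumulative input yields 450.0 and the per-slot input yields 275.0, matching the two formulas: with cumulative totals the oldest slot w1 carries the largest weight, while with per-slot counts the most recent slot w3 does.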