[ https://issues.apache.org/jira/browse/HBASE-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Biju Nair updated HBASE-14215:
------------------------------
Description:
The current multiplier of 500 that the stochastic balancer cost function
{{PrimaryRegionCountSkewCostFunction}} uses to weight primary replica skew
does not seem to be sufficient to prevent skew (see HBASE-14110). The default
should be higher so that skew in primary region replicas carries a higher
cost. The test results below are from a cluster with 3 racks and 9 region
server (RS) nodes. Setting the multiplier to 10000 (the same as the region
replica rack cost multiplier) seems to get the balancer to distribute the
primaries uniformly.
*Initial primary replica distribution - using the current multiplier*
||Region server||Primary replicas||
|r1n10|102|
|r1n11|85|
|r1n9|88|
|r2n10|120|
|r2n11|120|
|r2n9|124|
|r3n10|135|
|r3n11|124|
|r3n9|129|
*After a long duration of reads and writes - using the current multiplier*
||Region server||Primary replicas||
|r1n10|102|
|r1n11|85|
|r1n9|88|
|r2n10|120|
|r2n11|120|
|r2n9|124|
|r3n10|135|
|r3n11|124|
|r3n9|129|
*After manual balancing*
||Region server||Primary replicas||
|r1n10|102|
|r1n11|85|
|r1n9|88|
|r2n10|120|
|r2n11|120|
|r2n9|124|
|r3n10|135|
|r3n11|124|
|r3n9|129|
*Increased multiplier for primaryRegionCountSkewCost to 10000*
||Region server||Primary replicas||
|r1n10|114|
|r1n11|113|
|r1n9|114|
|r2n10|114|
|r2n11|114|
|r2n9|113|
|r3n10|115|
|r3n11|115|
|r3n9|115|
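To put a number on the skew in the tables above, here is a small Python sketch that summarizes the two distributions. The counts are copied from the tables. The helper name {{skew_summary}} and the choice of spread and standard deviation as skew measures are only for illustration; they are not what the balancer itself computes.

```python
from statistics import pstdev

# Primary replica counts per region server, copied from the tables above.
before = {
    "r1n10": 102, "r1n11": 85, "r1n9": 88,
    "r2n10": 120, "r2n11": 120, "r2n9": 124,
    "r3n10": 135, "r3n11": 124, "r3n9": 129,
}
after = {
    "r1n10": 114, "r1n11": 113, "r1n9": 114,
    "r2n10": 114, "r2n11": 114, "r2n9": 113,
    "r3n10": 115, "r3n11": 115, "r3n9": 115,
}


def skew_summary(counts):
    """Return simple skew measures for a {server: primary count} mapping.

    Illustrative helper only; not part of HBase.
    """
    values = list(counts.values())
    return {
        "total": sum(values),
        "min": min(values),
        "max": max(values),
        "spread": max(values) - min(values),  # gap between busiest and idlest server
        "stdev": pstdev(values),              # population standard deviation
    }


if __name__ == "__main__":
    for label, counts in (("multiplier 500", before), ("multiplier 10000", after)):
        print(label, skew_summary(counts))
```

Both distributions hold the same 1027 primaries. The spread between the most and least loaded region server drops from 50 (135 vs. 85) to 2 (115 vs. 113) with the higher multiplier.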
Setting the default {{PrimaryRegionCountSkewCostFunction}} multiplier to 10000
should work better for general HBase use.
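As a rough illustration of why the multiplier matters: the stochastic balancer scores a candidate layout as a weighted sum of per-function costs, each scaled to the range 0 to 1, so a function with a small multiplier can be outweighed by the others. The Python sketch below assumes that weighted-sum model. The skew cost of 0.2 and the combined weight of 600 for all other cost functions are hypothetical values picked for the example, not measurements from the test cluster, and {{primary_skew_share}} is a made-up helper name.

```python
# Hypothetical inputs for illustration only; not measured on any cluster.
HYPOTHETICAL_SKEW_COST = 0.2          # scaled primary-skew cost in [0, 1]
HYPOTHETICAL_OTHER_WEIGHTED = 600.0   # sum of multiplier * cost for all other functions


def primary_skew_share(multiplier,
                       skew_cost=HYPOTHETICAL_SKEW_COST,
                       other_weighted=HYPOTHETICAL_OTHER_WEIGHTED):
    """Fraction of the total weighted cost contributed by primary replica skew."""
    contribution = multiplier * skew_cost
    return contribution / (contribution + other_weighted)


if __name__ == "__main__":
    for multiplier in (500, 10000):
        share = primary_skew_share(multiplier)
        print(f"multiplier={multiplier}: skew share of total cost = {share:.2f}")
```

Under these assumed numbers, the skew term is a minor part of the total cost at 500 and the dominant part at 10000, which is consistent with the balancer only acting on primary skew after the change.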
> Default cost used for PrimaryRegionCountSkewCostFunction is not sufficient
> ---------------------------------------------------------------------------
>
> Key: HBASE-14215
> URL: https://issues.apache.org/jira/browse/HBASE-14215
> Project: HBase
> Issue Type: Bug
> Components: Balancer
> Reporter: Biju Nair
> Priority: Minor
> Attachments: 14215-v1.txt