[ 
https://issues.apache.org/jira/browse/HBASE-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730875#comment-14730875
 ] 

Biju Nair commented on HBASE-14215:
-----------------------------------

Primary replica distribution with 
{{hbase.master.balancer.stochastic.regionReplicaRackCostKey=0}} and 
{{hbase.master.balancer.stochastic.primaryRegionCountCost=500}}

**Randomly assigned the primaries with balancer off**
|r3n9   |112|
|r1n10  |108|
|r2n9   |116|
|r2n10  |134|
|r1n11  |119|
|r2n11  |109|
|r3n10  |119|
|r3n11  |115|
|r1n9   |95|

**Primary distribution after balancer run**
|r3n9   |114|
|r1n10  |115|
|r2n9   |114|
|r2n10  |114|
|r1n11  |114|
|r2n11  |114|
|r3n10  |114|
|r1n9   |114|
|r3n11  |114|

As expected the primary replicas seem to get uniformly distributed even with a 
low cost multiplier for 
{{hbase.master.balancer.stochastic.primaryRegionCountCost}} when rack awareness 
is disabled.


> Default cost used for PrimaryRegionCountSkewCostFunction is not sufficient 
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-14215
>                 URL: https://issues.apache.org/jira/browse/HBASE-14215
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>            Reporter: Biju Nair
>            Priority: Minor
>         Attachments: 14215-v1.txt
>
>
> Current multiplier of 500 used in the stochastic balancer cost function 
> {{PrimaryRegionCountSkewCostFunction}} to calculate the cost of  total 
> primary replication skew doesn't seem to be sufficient to prevent the skews 
> (Refer HBASE-14110). We would want the default cost to be a higher value so 
> that skews in primary region replica has higher cost. The following is the 
> test result by setting the multiplier value to 10000 (same as the region 
> replica rack cost multiplier) on a 3 Rack 9 RS node cluster which seems to 
> get the balancer distribute the primaries uniformly.
> *Initial Primary replica distribution - using the current multiplier* 
>  |r1n10|  102|
>  |r1n11|  85|
>  |r1n9|    88|
>  |r2n10|  120|
>  |r2n11|  120|
>  |r2n9|   124|
>  |r3n10|  135|
>  |r3n11|  124|
>  |r3n9|    129|
> *After long duration of read & writes - using current multiplier*     
> | r1n10|  102|
> | r1n11|  85|
> | r1n9|    88|
> | r2n10|  120|
> | r2n11|  120|
> | r2n9 |   124|
> | r3n10|  135|
> | r3n11|  124|
> | r3n9|    129|
> *After manual balancing*      
> | r1n10|  102|
> | r1n11|  85|
> | r1n9|    88|
> | r2n10|  120|
> | r2n11|  120|
> | r2n9 |   124|
> | r3n10|  135|
> | r3n11|  124|
> | r3n9|    129|
> *Increased multiplier for primaryRegionCountSkewCost to 10000*        
> | r1n10|  114|
> | r1n11 | 113|
> | r1n9 |   114|
> | r2n10|  114|
> | r2n11|  114|
> | r2n9 |   113|
> | r3n10|  115|
> | r3n11|  115|
> | r3n9 |   115 |
> Setting the {{PrimaryRegionCountSkewCostFunction}} multiplier value to 10000 
> should help HBase general use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to