[ 
https://issues.apache.org/jira/browse/HBASE-14867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020207#comment-15020207
 ] 

Enis Soztutar commented on HBASE-14867:
---------------------------------------

SimpleRN is too simple :) 
I think we should introduce a more sophisticated RN. 
 - Run every 5 min by default rather than every 30 min. This would be similar 
to the balancer.
 - SRN computes only 1 action per run. This clearly will not work with 10K 
region tables. We should be able to compute a batch of normalization plans.
 - RN should look for the best possible split or merge actions across all 
regions, not only at the smallest or largest region. In a single pass, we 
should be able to calculate whether to split or merge for every pair of 
neighbors. 
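
A minimal sketch of what such a single-pass, batch-producing normalizer could 
look like. This is an illustration only, not the SimpleRegionNormalizer code: 
the class BatchNormalizerSketch, the Plan type and the computePlans method are 
hypothetical names, region sizes are passed in as a plain array in key order, 
and the "split if larger than twice the average" threshold is an assumption. 
The merge rule (two neighbors whose combined size is below the average) 
follows the issue description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names are HBase APIs. */
final class BatchNormalizerSketch {

    /** Hypothetical plan type: a SPLIT of one region or a MERGE of two neighbors. */
    static final class Plan {
        final String type;   // "SPLIT" or "MERGE"
        final int first;     // index of the (first) region, in key order
        final int second;    // index of the merge partner, or -1 for a split

        Plan(String type, int first, int second) {
            this.type = type;
            this.first = first;
            this.second = second;
        }

        @Override
        public String toString() {
            return second < 0 ? type + "(" + first + ")"
                              : type + "(" + first + "," + second + ")";
        }
    }

    /**
     * Walks the regions once, in key order, and returns every plan found
     * instead of stopping at the first one.
     *
     * @param sizes region sizes in key order (any consistent unit)
     */
    static List<Plan> computePlans(long[] sizes) {
        List<Plan> plans = new ArrayList<>();
        if (sizes.length == 0) {
            return plans;
        }
        long total = 0;
        for (long s : sizes) {
            total += s;
        }
        double avg = (double) total / sizes.length;

        int i = 0;
        while (i < sizes.length) {
            if (sizes[i] > 2 * avg) {
                // Assumed threshold: split regions larger than twice the average.
                plans.add(new Plan("SPLIT", i, -1));
                i++;
            } else if (i + 1 < sizes.length && sizes[i] + sizes[i + 1] < avg) {
                // Neighbors whose combined size is below the average are merged.
                plans.add(new Plan("MERGE", i, i + 1));
                i += 2; // skip the partner so a region is in at most one plan
            } else {
                i++;
            }
        }
        return plans;
    }
}
```

For the scenario in the description, with relative sizes {0, 100, 100, 100, 
27, 27} for r1 to r6, the average is 59 and the only plan returned is a merge 
of indexes 4 and 5, that is r5 and r6. Skipping the merge partner keeps each 
region in at most one plan per run, so the batch can be executed without 
plans conflicting with each other.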

> SimpleRegionNormalizer needs to have better heuristics to trigger merge 
> operation
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-14867
>                 URL: https://issues.apache.org/jira/browse/HBASE-14867
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.2.0
>            Reporter: Romil Choksi
>
> SimpleRegionNormalizer needs to have better heuristics to trigger merge 
> operation. SimpleRegionNormalizer is not able to trigger a merge action if 
> the table's smallest region has neighboring regions that are larger than 
> the table's average region size, even when there are other small regions 
> whose combined size is less than the average region size. 
> For example, 
> - Consider a table with six regions, say r1 to r6. 
> - Keep r1 empty and create some data, say 100K rows, for each of the 
> regions r2, r3 and r4. Create a smaller amount of data for regions r5 and 
> r6, say about 27K rows.
> - Run the normalizer. Verify the number of regions for that table and also 
> check the master log to see if any merge action was triggered as a result of 
> normalization. 
> In such a scenario, it would be better to have a merge action triggered for 
> those two smaller regions r5 and r6, even though neither of them is the 
> smallest one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
