[ 
https://issues.apache.org/jira/browse/HBASE-27496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17636000#comment-17636000
 ] 

Bryan Beaudreault commented on HBASE-27496:
-------------------------------------------

Ok, so the concern is that a single counter could cause splits (which are 
calculated first) to starve merges. Just curious how you'd configure the two 
settings to get around this problem? I guess configure both counters to half of 
what you actually want? That seems to have its own drawbacks. I also wonder how 
often there are enough of both splits and merges to cause this to be a problem, 
given typically the normalizer runs incrementally anyway (so it could just run 
again and again until all splits/merges are dealt with).
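To make the starvation concern concrete, here is a minimal sketch in Python (not HBase code). The helper `take_within_budget`, the candidate sizes, and the budgets are all hypothetical, and the exact cutoff behaviour at the limit is an assumption. It compares one shared budget consumed by splits first against two separate budgets of half the size each.

```python
def take_within_budget(plan_sizes_mb, budget_mb):
    """Accept plans in order until the next one would not fit in budget_mb.

    Returns (accepted_sizes, remaining_budget_mb).
    Hypothetical helper; the real normalizer's cutoff rule may differ.
    """
    accepted = []
    remaining = budget_mb
    for size in plan_sizes_mb:
        if size > remaining:
            break
        accepted.append(size)
        remaining -= size
    return accepted, remaining


# Hypothetical candidate plans, expressed as cumulative region size in MB.
split_candidates = [500, 500, 500]
merge_candidates = [100, 100]

# One shared 1000 MB budget, with splits calculated first.
shared_splits, leftover = take_within_budget(split_candidates, 1000)
shared_merges, _ = take_within_budget(merge_candidates, leftover)

# Two separate budgets, each half of the intended total.
separate_splits, _ = take_within_budget(split_candidates, 500)
separate_merges, _ = take_within_budget(merge_candidates, 500)

print("shared:  ", shared_splits, shared_merges)
print("separate:", separate_splits, separate_merges)
```

With these made-up numbers the shared budget admits two splits and no merges, while the separate budgets admit one split and both merges. That illustrates both the starvation and the drawback of halving each limit.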

> Limit size of plans produced by SimpleRegionNormalizer
> ------------------------------------------------------
>
>                 Key: HBASE-27496
>                 URL: https://issues.apache.org/jira/browse/HBASE-27496
>             Project: HBase
>          Issue Type: Improvement
>          Components: Normalizer
>            Reporter: Charles Connell
>            Priority: Minor
>
> My company (Hubspot) is starting to use {{SimpleRegionNormalizer}}. We 
> turn the normalizer switch on for 30 minutes each day, when our database 
> traffic is at a low point. We're using the 
> {{hbase.normalizer.throughput.max_bytes_per_sec}} setting to create a rate 
> limit. I've found that while the {{SimpleRegionNormalizer}} only produces new 
> plans for 30 minutes each day, the plans often take many hours to execute. 
> This leads to region splits, merges, and moves occurring in our HBase clusters 
> during hours we'd prefer them not to.
> I propose two new settings:
>  * {{hbase.normalizer.merge.plans_size_limit.mb}}
>  * {{hbase.normalizer.split.plans_size_limit.mb}}
> This will allow HBase administrators to limit the number of plans produced by 
> a run of {{SimpleRegionNormalizer}}, by forcing it to stop producing new 
> plans once the cumulative region size limits are exceeded. This will give you 
> a way to limit approximately how long it takes to execute the plans. Because 
> the current limit to execute plans is primarily determined by a per-byte rate 
> limit, I propose that the new settings also work on a similar basis. This 
> will make it feasible to reason about how your rate limit and your size 
> limits interact.
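
As a rough illustration of how the quoted size limits and the byte rate limit would interact, the sketch below estimates execution time as total plan size divided by throughput. This is back-of-the-envelope arithmetic in Python, not HBase code. The function name is hypothetical, 1 MB is assumed to be 2^20 bytes, and the rate limiter is assumed to be the only bottleneck.

```python
def estimated_execution_seconds(plans_size_limit_mb, max_bytes_per_sec):
    """Approximate time to execute plans totalling plans_size_limit_mb.

    Assumes 1 MB = 1024 * 1024 bytes and that the throughput limit
    (hbase.normalizer.throughput.max_bytes_per_sec) is the only bottleneck.
    """
    total_bytes = plans_size_limit_mb * 1024 * 1024
    return total_bytes / max_bytes_per_sec


# Hypothetical values: a 100 GB size limit and a 50 MB/s rate limit.
limit_mb = 100 * 1024
rate_bytes_per_sec = 50 * 1024 * 1024

seconds = estimated_execution_seconds(limit_mb, rate_bytes_per_sec)
print(f"about {seconds / 60:.1f} minutes")
```

Under these assumptions, dividing the desired execution window by the rate limit gives a starting point for choosing the size limits.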



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
