[
https://issues.apache.org/jira/browse/HBASE-27496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17636000#comment-17636000
]
Bryan Beaudreault commented on HBASE-27496:
-------------------------------------------
Ok, so the concern is that a single counter could cause splits (which are
calculated first) to starve merges. Just curious how you'd configure the two
settings to get around this problem? I guess configure both counters to half of
what you actually want? That seems to have its own drawbacks. I also wonder how
often there are enough of both splits and merges to cause this to be a problem,
given typically the normalizer runs incrementally anyway (so it could just run
again and again until all splits/merges are dealt with).
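
To make the trade-off concrete, here is a toy model of the two options. It is a sketch only, not the SimpleRegionNormalizer code: the class name, method names, and the idea of expressing demand and budgets in MB are all assumptions made for illustration. It only assumes, as stated above, that splits are calculated before merges.

```java
// Toy model (hypothetical names throughout): compares one shared size budget,
// consumed by splits first, against two separate budgets.
class SharedBudgetSketch {

    /** Returns {mbGrantedToSplits, mbGrantedToMerges} when splits draw from one budget first. */
    static long[] sharedBudget(long splitDemandMb, long mergeDemandMb, long budgetMb) {
        long splits = Math.min(splitDemandMb, budgetMb);
        // Merges only get whatever the splits left over.
        long merges = Math.min(mergeDemandMb, budgetMb - splits);
        return new long[] {splits, merges};
    }

    /** Returns {mbGrantedToSplits, mbGrantedToMerges} when each plan type has its own budget. */
    static long[] separateBudgets(long splitDemandMb, long mergeDemandMb,
                                  long splitBudgetMb, long mergeBudgetMb) {
        return new long[] {
            Math.min(splitDemandMb, splitBudgetMb),
            Math.min(mergeDemandMb, mergeBudgetMb)
        };
    }

    public static void main(String[] args) {
        // Heavy split demand: the shared budget leaves nothing for merges.
        long[] shared = sharedBudget(10_000, 3_000, 4_000);
        System.out.println("shared:   splits=" + shared[0] + " merges=" + shared[1]);

        // Halving the budget per type avoids starvation...
        long[] halves = separateBudgets(10_000, 3_000, 2_000, 2_000);
        System.out.println("separate: splits=" + halves[0] + " merges=" + halves[1]);

        // ...but wastes capacity when only one plan type has work to do.
        long[] idleSplits = separateBudgets(0, 3_000, 2_000, 2_000);
        System.out.println("separate, no splits: merges=" + idleSplits[1] + " of 4000 MB wanted");
    }
}
```

In this model the shared budget starves merges whenever split demand alone exceeds the budget, while two half-sized budgets leave capacity unused on runs where only one plan type has work. Repeated incremental runs would soften both effects.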
> Limit size of plans produced by SimpleRegionNormalizer
> ------------------------------------------------------
>
> Key: HBASE-27496
> URL: https://issues.apache.org/jira/browse/HBASE-27496
> Project: HBase
> Issue Type: Improvement
> Components: Normalizer
> Reporter: Charles Connell
> Priority: Minor
>
> My company (Hubspot) is starting to use {{SimpleRegionNormalizer}}. We
> turn the normalizer switch on for 30 minutes each day, when our database
> traffic is at a low point. We're using the
> {{hbase.normalizer.throughput.max_bytes_per_sec}} setting to create a rate
> limit. I've found that while the {{SimpleRegionNormalizer}} only produces new
> plans for 30 minutes each day, the plans often take many hours to execute.
> This leads to region splits, merges, and moves occurring in our HBase clusters
> during hours we'd prefer them not to.
> I propose two new settings:
> * {{hbase.normalizer.merge.plans_size_limit.mb}}
> * {{hbase.normalizer.split.plans_size_limit.mb}}
> This will allow HBase administrators to limit the number of plans produced by
> a run of {{SimpleRegionNormalizer}}, by forcing it to stop producing new
> plans once the cumulative region size limits are exceeded. This will give you
> a way to limit approximately how long it takes to execute the plans. Because
> plan execution time is currently determined primarily by a bytes-per-second
> rate limit, I propose that the new settings also be size-based. This
> will make it feasible to reason about how your rate limit and your size
> limits interact.
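
As a rough illustration of how the proposed size limits and the existing rate limit would interact, the sketch below caps a list of candidate plans by cumulative region size and estimates how long the accepted plans take to execute at a given bytes-per-second rate. Everything here is hypothetical: the class, the `Plan` type, and the helper methods are invented for the example; it assumes 1 MB = 1024 * 1024 bytes; and it assumes a plan is rejected if accepting it would push the running total past the limit, which the description above does not specify.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the SimpleRegionNormalizer implementation.
class PlanSizeLimitSketch {

    /** Stand-in for a split or merge plan: a label plus the total size of the regions it touches. */
    static final class Plan {
        final String description;
        final long regionSizeMb;

        Plan(String description, long regionSizeMb) {
            this.description = description;
            this.regionSizeMb = regionSizeMb;
        }
    }

    /** Accepts plans in order and stops at the first one that would exceed the cumulative limit. */
    static List<Plan> capBySize(List<Plan> candidates, long limitMb) {
        List<Plan> accepted = new ArrayList<>();
        long cumulativeMb = 0;
        for (Plan plan : candidates) {
            if (cumulativeMb + plan.regionSizeMb > limitMb) {
                break; // assumption: stop producing plans entirely, as the proposal describes
            }
            accepted.add(plan);
            cumulativeMb += plan.regionSizeMb;
        }
        return accepted;
    }

    /** Sums the region sizes of the given plans, in MB. */
    static long totalMb(List<Plan> plans) {
        long total = 0;
        for (Plan plan : plans) {
            total += plan.regionSizeMb;
        }
        return total;
    }

    /** Estimated execution time if throughput is the only bottleneck (assumes 1 MB = 1024 * 1024 bytes). */
    static double estimatedSeconds(long sizeMb, long maxBytesPerSec) {
        return (sizeMb * 1024.0 * 1024.0) / maxBytesPerSec;
    }

    public static void main(String[] args) {
        List<Plan> candidates = List.of(
            new Plan("split region A", 2048),
            new Plan("split region B", 1536),
            new Plan("split region C", 1024),
            new Plan("split region D", 512));

        long limitMb = 4096;                 // e.g. a value for a *.plans_size_limit.mb setting
        long maxBytesPerSec = 1024 * 1024;   // e.g. a value for the throughput rate limit

        List<Plan> accepted = capBySize(candidates, limitMb);
        long acceptedMb = totalMb(accepted);
        System.out.println("accepted plans: " + accepted.size() + ", total MB: " + acceptedMb);
        System.out.println("estimated seconds: " + estimatedSeconds(acceptedMb, maxBytesPerSec));
    }
}
```

Under these assumptions the worst-case execution time per plan type is roughly the size limit divided by the rate limit, so a 4096 MB limit at 1 MB/s bounds the work at about 68 minutes.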