Heya, I've seen a lot of use-cases where the normalizer would be a nice solution for operators and application developers. I've been trying to beef it up a bit to handle these cases. However, some of these considerations are at odds, so I want to vet the ideas here.
The normalizer is a background chore in the HMaster that attempts to converge region sizes within a table toward the average region size. It has a pretty wide error bar, but that's the overall goal.

Early on, it was observed that an operator often needs to pre-split a table, so special considerations were included by way of `hbase.normalizer.min.region.count`, `hbase.normalizer.merge.min_region_age.days`, and `hbase.normalizer.merge.min_region_size.mb`. All these knobs are designed to give an operator a means of controlling this behavior, so that a freshly pre-split table is not immediately merged back together.

We have (what I see as) a competing objective: doing away with empty or nearly-empty regions. This use-case is pretty common when there's a TTL applied to a table, especially if there's also a timestamp component in the rowkey. In this case, we want the normalizer to "merge away" these empty regions.

The trouble is that we ship defaults for all of the `*min*` configs, and right now there's no way to "unset" them and so disable the functionality. That means there still isn't a way to support the empty-regions use-case without awkward special-case checks.

This is where I'm looking for suggestions from the community. There's some discussion under way over on the PR for HBASE-24583. Please take a look.

Thanks in advance,
Nick
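
P.S. To make the conflict concrete, here's a rough Python sketch. It is not the actual normalizer code, just a simplified model of "merge adjacent regions that are small relative to the table average, subject to the three `*min*` limits". The `Region` and `MergeLimits` types, the `merge_candidates` function, and the limit values are all hypothetical and chosen for illustration; they are not necessarily what HBase ships or how it is structured.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    size_mb: int
    age_days: int


@dataclass
class MergeLimits:
    # Hypothetical stand-ins for the three *min* configs.
    # The values are illustrative, not necessarily HBase's shipped defaults.
    min_region_count: int = 3
    min_region_age_days: int = 3
    min_region_size_mb: int = 1


def merge_candidates(regions, limits):
    """Return names of adjacent region pairs that this simplified model would merge."""
    # Tables with fewer regions than the minimum are left alone entirely.
    if len(regions) < limits.min_region_count:
        return []

    avg_mb = sum(r.size_mb for r in regions) / len(regions)

    def eligible(r):
        # A region must be old enough AND large enough to be considered.
        # The size floor protects pre-split tables, but it also
        # excludes the empty regions left behind by a TTL.
        return (r.age_days >= limits.min_region_age_days
                and r.size_mb >= limits.min_region_size_mb)

    pairs = []
    i = 0
    while i + 1 < len(regions):
        a, b = regions[i], regions[i + 1]
        if eligible(a) and eligible(b) and a.size_mb + b.size_mb < avg_mb:
            pairs.append((a.name, b.name))
            i += 2  # both regions are consumed by this merge
        else:
            i += 1
    return pairs


# A TTL'd table with a timestamp in the rowkey: old regions have emptied out.
table = [
    Region("r0", 0, 30),
    Region("r1", 0, 30),
    Region("r2", 0, 30),
    Region("r3", 500, 30),
    Region("r4", 600, 30),
]

print(merge_candidates(table, MergeLimits()))
print(merge_candidates(table, MergeLimits(min_region_size_mb=0)))
```

In this model the first call proposes nothing, because the size floor makes every empty region ineligible; only when the floor can be set to zero does the second call propose merging `r0` with `r1`. That "set it to zero" escape hatch is exactly what I mean by being able to unset the limit.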
