ndimiduk commented on a change in pull request #2596:
URL: https://github.com/apache/hbase/pull/2596#discussion_r657261818



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
##########
@@ -315,35 +316,60 @@ private boolean skipForMerge(final RegionStates regionStates, final RegionInfo r
    * towards target average or target region count.
    */
   private List<NormalizationPlan> computeMergeNormalizationPlans(final NormalizeContext ctx) {
-    if (ctx.getTableRegions().size() < minRegionCount) {
+    if (isEmpty(ctx.getTableRegions()) || ctx.getTableRegions().size() < minRegionCount) {
       LOG.debug("Table {} has {} regions, required min number of regions for normalizer to run"
         + " is {}, not computing merge plans.", ctx.getTableName(), ctx.getTableRegions().size(),
         minRegionCount);
       return Collections.emptyList();
     }
 
-    final double avgRegionSizeMb = ctx.getAverageRegionSizeMb();
+    final long avgRegionSizeMb = (long) ctx.getAverageRegionSizeMb();
+    if (avgRegionSizeMb < mergeMinRegionSizeMb) {

Review comment:
       I'd have to look back through the original JIRA and PR, but I believe 
the concern was that the normalizer would merge away the splits made at table 
creation, in the window between when the table was created and when the 
operator got around to loading it with data. The "minimum table size" 
configuration was designed to prevent this. This behavior predates 
HBASE-24419; the functionality was preserved when that optimization was 
implemented.
   
   I personally am not a fan, and prefer the "minimum table age" configuration 
for handling this concern.
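
   To make the two guards concrete, here is a minimal sketch. It is not the 
normalizer's code: `MergeGuardSketch`, `Region`, `skipByRegionCount`, and 
`oldEnoughToMerge` are hypothetical names made up for illustration, and the 
thresholds are plain parameters rather than real configuration keys. The 
count-based guard skips the whole table when it has too few regions; the 
age-based guard leaves recently created regions alone, however many there are.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical illustration only; none of these names come from HBase.
final class MergeGuardSketch {

  /** Stand-in for a region; only its creation time matters for this sketch. */
  static final class Region {
    final String name;
    final Instant createdAt;

    Region(String name, Instant createdAt) {
      this.name = name;
      this.createdAt = createdAt;
    }
  }

  /** Count-based guard: skip merge planning when the table has too few regions. */
  static boolean skipByRegionCount(List<Region> regions, int minRegionCount) {
    return regions == null || regions.size() < minRegionCount;
  }

  /** Age-based guard: a region may be merged only once it has existed for minAge. */
  static boolean oldEnoughToMerge(Region region, Instant now, Duration minAge) {
    // True when createdAt + minAge is at or before now.
    return !region.createdAt.plus(minAge).isAfter(now);
  }

  private MergeGuardSketch() {
  }
}
```

   With the age-based guard, a freshly pre-split table is protected because 
every region is too young to merge, so no table-wide region count threshold is 
needed to cover that case.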
