[ 
https://issues.apache.org/jira/browse/HBASE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205866#comment-14205866
 ] 

zhangduo commented on HBASE-12451:
----------------------------------

I think there is also an upper limit, so the split size will not grow too large:
{code}
  protected long getSizeToCheck(final int tableRegionsCount) {
    // safety check for 100 to avoid numerical overflow in extreme cases
    return tableRegionsCount == 0 || tableRegionsCount > 100 ? getDesiredMaxFileSize():
      Math.min(getDesiredMaxFileSize(),
        this.initialSize * tableRegionsCount * tableRegionsCount * tableRegionsCount);
  }

  this.desiredMaxFileSize = conf.getLong(HConstants.HREGION_MAX_FILESIZE,
        HConstants.DEFAULT_MAX_FILE_SIZE);

  /** Conf key for the max file size after which we split the region */
  public static final String HREGION_MAX_FILESIZE =
      "hbase.hregion.max.filesize";

  /** Default maximum file size */
  public static final long DEFAULT_MAX_FILE_SIZE = 10 * 1024 * 1024 * 1024L;
{code}

We have two reasons to split. The first is load balancing, which motivates an 
increasing region split size. The second is compaction, which motivates a 
constant region split size (the upper limit).
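
As a rough sketch of how the two limits interact, the snippet below recomputes 
the threshold outside of HBase. The class name is made up for illustration, the 
128 MB flush size is an assumed default, and initialSize is taken as 2x the 
flush size as stated in the issue description.

```java
final class SplitSizeSketch {
    // Assumed values, for illustration only.
    static final long FLUSH_SIZE = 128L * 1024 * 1024;           // assumed memstore flush size
    static final long INITIAL_SIZE = 2 * FLUSH_SIZE;             // "2x the region flush size"
    static final long MAX_FILE_SIZE = 10L * 1024 * 1024 * 1024;  // DEFAULT_MAX_FILE_SIZE

    // Mirrors the getSizeToCheck logic quoted above: cubic growth, capped at the max file size.
    static long sizeToCheck(int tableRegionsCount) {
        if (tableRegionsCount == 0 || tableRegionsCount > 100) {
            return MAX_FILE_SIZE;
        }
        long cubed = (long) tableRegionsCount * tableRegionsCount * tableRegionsCount;
        return Math.min(MAX_FILE_SIZE, INITIAL_SIZE * cubed);
    }

    public static void main(String[] args) {
        // Print the threshold for 1..5 regions of the same table on one server.
        for (int n = 1; n <= 5; n++) {
            System.out.printf("regions=%d splitSize=%d MB%n", n, sizeToCheck(n) / (1024 * 1024));
        }
    }
}
```

With these assumed values the increasing part only matters for one to three 
regions (256 MB, 2048 MB, 6912 MB); from four regions on, the threshold is the 
constant 10 GB upper limit.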

I think the first thing that needs to be done is to define "unnecessary region 
splits". If a table already has 240 regions, and only one region of this table 
is on a regionserver, should that region have a small split size?
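
To make the question concrete, here is a simplified sketch of what the current 
formula does as regions of a table leave a server. The class, the wouldSplit 
helper and the 1 GB store size are hypothetical, and the 256 MB initial size 
assumes a 128 MB flush size; the real policy checks more conditions than this.

```java
final class UnloadScenarioSketch {
    static final long MB = 1024L * 1024;
    static final long INITIAL_SIZE = 256 * MB;     // assumed: 2 x 128 MB flush size
    static final long MAX_FILE_SIZE = 10240 * MB;  // DEFAULT_MAX_FILE_SIZE (10 GB)

    // Same shape as the getSizeToCheck method quoted above.
    static long sizeToCheck(int tableRegionsCount) {
        if (tableRegionsCount == 0 || tableRegionsCount > 100) {
            return MAX_FILE_SIZE;
        }
        long cubed = (long) tableRegionsCount * tableRegionsCount * tableRegionsCount;
        return Math.min(MAX_FILE_SIZE, INITIAL_SIZE * cubed);
    }

    // Hypothetical simplification: a region is a split candidate once a store
    // is larger than the threshold for the current per-server region count.
    static boolean wouldSplit(long storeSize, int regionsOfTableOnServer) {
        return storeSize > sizeToCheck(regionsOfTableOnServer);
    }

    public static void main(String[] args) {
        long storeSize = 1024 * MB; // a 1 GB store that stays on the server
        // Regions of the same table are moved away one by one.
        for (int n = 4; n >= 1; n--) {
            System.out.printf("regionsOnServer=%d threshold=%d MB wouldSplit=%b%n",
                n, sizeToCheck(n) / MB, wouldSplit(storeSize, n));
        }
    }
}
```

Under these assumptions the 1 GB store is below the threshold while four, three 
or two regions of the table are on the server, and exceeds it only when it is 
the last one left, regardless of how many regions the table has cluster-wide.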

> IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits 
> in rolling update of cluster
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-12451
>                 URL: https://issues.apache.org/jira/browse/HBASE-12451
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> Currently IncreasingToUpperBoundRegionSplitPolicy is the default region split 
> policy. In this policy, the split size is the number of regions of the same 
> table on this server, cubed, times 2x the region flush size.
> But when unloading the regions of a regionserver with region_mover.rb, the 
> number of regions of the same table on this server decreases, and the split 
> size decreases too, which may cause the remaining regions on the regionserver 
> to split. Region splits also happen when loading regions onto a regionserver.
> An improvement may be to set a minimum split size in 
> IncreasingToUpperBoundRegionSplitPolicy.
> Suggestions are welcome. Thanks~



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
