[ https://issues.apache.org/jira/browse/HBASE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205866#comment-14205866 ]
zhangduo commented on HBASE-12451:
----------------------------------
I think there is also an upper limit, so the split size will not grow too large:
{code}
protected long getSizeToCheck(final int tableRegionsCount) {
  // safety check for 100 to avoid numerical overflow in extreme cases
  return tableRegionsCount == 0 || tableRegionsCount > 100 ?
    getDesiredMaxFileSize() :
    Math.min(getDesiredMaxFileSize(),
      this.initialSize * tableRegionsCount * tableRegionsCount *
        tableRegionsCount);
}

this.desiredMaxFileSize = conf.getLong(HConstants.HREGION_MAX_FILESIZE,
  HConstants.DEFAULT_MAX_FILE_SIZE);

/** Conf key for the max file size after which we split the region */
public static final String HREGION_MAX_FILESIZE =
  "hbase.hregion.max.filesize";

/** Default maximum file size */
public static final long DEFAULT_MAX_FILE_SIZE = 10 * 1024 * 1024 * 1024L;
{code}
We have two reasons to split. The first is load balancing, which motivates an
increasing region split size. The second is compaction, which motivates a
constant region split size (which is the upper limit).
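To make the growth and the cap concrete, here is a minimal standalone sketch of the same formula. It is not the HBase class itself: the class name SplitSizeSketch and the explicit parameters are made up for illustration, and the initial size is an assumption (2x a 128 MB flush size, following the issue description; the real value depends on configuration).

```java
public class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // Value shown in the HConstants excerpt above: 10 GB.
    static final long DEFAULT_MAX_FILE_SIZE = 10 * 1024 * 1024 * 1024L;

    // Assumption for illustration: initialSize = 2 x a 128 MB memstore flush size.
    static final long ASSUMED_INITIAL_SIZE = 2 * 128 * MB;

    // Same logic as getSizeToCheck above, with the fields passed in as parameters.
    static long sizeToCheck(int tableRegionsCount, long initialSize, long maxFileSize) {
        if (tableRegionsCount == 0 || tableRegionsCount > 100) {
            return maxFileSize;
        }
        long n = tableRegionsCount;
        return Math.min(maxFileSize, initialSize * n * n * n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 4, 240}) {
            long size = sizeToCheck(n, ASSUMED_INITIAL_SIZE, DEFAULT_MAX_FILE_SIZE);
            System.out.println(n + " region(s) on this server -> " + (size / MB) + " MB");
        }
    }
}
```

Under these assumed values the split size is 256 MB, 2048 MB and 6912 MB for 1, 2 and 3 regions, and it is capped at 10240 MB from 4 regions onward.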
I think the first thing that needs to be done is to define "unnecessary region
splits". If a table already has 240 regions, and only one region of this table
is on a regionserver, should that region have a small split size?
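The question can be illustrated with a small sketch of the rolling-update case. Everything here is an assumption for illustration: the class RollingUpdateSketch, the 256 MB initial size, the simplified "region size > split size" check, and especially the floor parameter, which is only a guess at what the "minimum split size" suggested in the issue could look like, not an existing HBase option.

```java
public class RollingUpdateSketch {
    static final long MB = 1024L * 1024L;
    static final long MAX_FILE_SIZE = 10 * 1024 * MB;   // 10 GB upper limit
    static final long ASSUMED_INITIAL_SIZE = 256 * MB;  // assumed 2 x 128 MB flush size

    // Split size for n regions of the table on this server.
    // 'floor' is a hypothetical minimum split size; pass 0 for the current behaviour.
    static long sizeToCheck(int n, long floor) {
        long grown = (n == 0 || n > 100)
            ? MAX_FILE_SIZE
            : Math.min(MAX_FILE_SIZE, ASSUMED_INITIAL_SIZE * n * n * n);
        return Math.max(grown, Math.min(floor, MAX_FILE_SIZE));
    }

    // Simplified check: the real policy looks at store sizes, not one region size.
    static boolean wouldSplit(long regionSize, int n, long floor) {
        return regionSize > sizeToCheck(n, floor);
    }

    public static void main(String[] args) {
        long regionSize = 3 * 1024 * MB; // a 3 GB region
        System.out.println("4 regions on server, no floor: " + wouldSplit(regionSize, 4, 0));
        System.out.println("1 region left, no floor:       " + wouldSplit(regionSize, 1, 0));
        System.out.println("1 region left, 4 GB floor:     " + wouldSplit(regionSize, 1, 4 * 1024 * MB));
    }
}
```

With these assumed numbers, a 3 GB region does not split while four regions of the table are on the server, does split once it is the only one left (the split size drops to 256 MB), and does not split if a 4 GB floor is applied.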
> IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits
> in rolling update of cluster
> --------------------------------------------------------------------------------------------------------
>
> Key: HBASE-12451
> URL: https://issues.apache.org/jira/browse/HBASE-12451
> Project: HBase
> Issue Type: Bug
> Reporter: Liu Shaohui
> Assignee: Liu Shaohui
> Priority: Minor
> Fix For: 2.0.0
>
>
> Currently IncreasingToUpperBoundRegionSplitPolicy is the default region split
> policy. In this policy, the split size is the number of regions of the same
> table on this server, cubed, times 2x the region flush size.
> But when unloading regions from a regionserver in a cluster using
> region_mover.rb, the number of regions of the same table on this server
> decreases, and the split size decreases too, which may cause the remaining
> regions on the regionserver to split. Region splits also happen when loading
> regions onto a regionserver in a cluster.
> One improvement may be to set a minimum split size in
> IncreasingToUpperBoundRegionSplitPolicy.
> Suggestions are welcome. Thanks~