[
https://issues.apache.org/jira/browse/HBASE-20361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429518#comment-16429518
]
Ted Yu commented on HBASE-20361:
--------------------------------
Since the new test is for TableInputFormatBase, I think it is in the right place.
Can you add javadoc for the classes added in TestTableInputFormatBase
explaining what they do?
{code}
protected RegionSizeCalculator newRegionSizeCalculator(RegionLocator locator, Admin admin)
{code}
Suggested rename: newRegionSizeCalculator -> createRegionSizeCalculator
Thanks
> Non-successive TableInputSplits may wrongly be merged by auto balancing
> feature
> -------------------------------------------------------------------------------
>
> Key: HBASE-20361
> URL: https://issues.apache.org/jira/browse/HBASE-20361
> Project: HBase
> Issue Type: Bug
> Components: mapreduce
> Reporter: Yuki Tawara
> Priority: Major
> Attachments: HBASE-20361.master.001.patch
>
>
> The TableInputFormatBase class offers users a mechanism to exclude specific
> splits from the list returned by TableInputFormatBase#getSplits, through
> TableInputFormatBase#includeRegionInSplit.
> It also offers users a feature called "auto balancing" to mitigate data skew
> by splitting large splits and merging small splits.
> If a user overrides TableInputFormatBase#includeRegionInSplit, the i-th split
> and the (i+1)-th split may not be successive (the i-th split's end key is
> smaller than the (i+1)-th split's start key).
> If the user also enables the auto balancing feature, non-successive splits can
> be merged, which means the excluded splits lying between the merged splits
> are "revived": their key ranges end up covered by the merged split.
> To avoid such cases, we should not merge non-successive splits.
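
A minimal sketch of the guard described above, not the code in the attached patch. It assumes a hypothetical `Split` class standing in for TableSplit (start key, end key, estimated size) and a hypothetical `mergeSmallSplits` helper; none of these names come from HBase. Two neighbouring splits are merged only when the left split's end key equals the right split's start key, so a region dropped by includeRegionInSplit cannot be absorbed into a merged range.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SuccessiveSplitMergeSketch {

  /** Hypothetical stand-in for a table split: key range [startKey, endKey) plus a size estimate. */
  static final class Split {
    final byte[] startKey;
    final byte[] endKey;
    final long sizeBytes;

    Split(byte[] startKey, byte[] endKey, long sizeBytes) {
      this.startKey = startKey;
      this.endKey = endKey;
      this.sizeBytes = sizeBytes;
    }

    Split(String startKey, String endKey, long sizeBytes) {
      this(startKey.getBytes(StandardCharsets.UTF_8),
           endKey.getBytes(StandardCharsets.UTF_8),
           sizeBytes);
    }
  }

  /** Two splits are successive when the left one ends exactly where the right one starts. */
  static boolean isSuccessive(Split left, Split right) {
    return Arrays.equals(left.endKey, right.startKey);
  }

  /**
   * Merges neighbouring splits while the combined size stays within targetSizeBytes,
   * but never across a gap in the key space. Input is assumed to be sorted by start key.
   */
  static List<Split> mergeSmallSplits(List<Split> splits, long targetSizeBytes) {
    List<Split> merged = new ArrayList<>();
    Split current = null;
    for (Split next : splits) {
      if (current != null
          && isSuccessive(current, next)
          && current.sizeBytes + next.sizeBytes <= targetSizeBytes) {
        // Contiguous and small enough: extend the current split to cover both ranges.
        current = new Split(current.startKey, next.endKey, current.sizeBytes + next.sizeBytes);
      } else {
        // Either the first split, a gap in the key space, or the size limit was reached.
        if (current != null) {
          merged.add(current);
        }
        current = next;
      }
    }
    if (current != null) {
      merged.add(current);
    }
    return merged;
  }

  public static void main(String[] args) {
    // Regions [a,b), [b,c), [c,d); suppose includeRegionInSplit excluded [b,c).
    List<Split> splits = Arrays.asList(new Split("a", "b", 10), new Split("c", "d", 10));
    // Both splits are small, but they are not successive, so they stay separate.
    System.out.println(mergeSmallSplits(splits, 100).size()); // prints 2
  }
}
```

Without the `isSuccessive` check, the two splits in `main` would collapse into a single range [a,d), which would scan the excluded region [b,c) again.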