[
https://issues.apache.org/jira/browse/HBASE-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866327#comment-13866327
]
Michael Webster commented on HBASE-10003:
-----------------------------------------
I have a thought on how to do this, although I am unfamiliar with the merge
internals. It seems like you could merge the first and second regions, then
merge in region 3 if the combined region size stays below the value of the
'-max' parameter. Once adding another region would push the combined size
over '-max', start the process again from the next region.
Basically the rule would look like this:
merge(R1, R2) into R1'
if R1'.size + R3.size < MAX:
    merge(R1', R3)
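As a rough illustration of the rule above, here is a minimal Python sketch of
the planning step only. It does not call any HBase API: regions are plain
(name, size) tuples in key order, and plan_merges is a hypothetical helper
name. One assumption differs from the rule as written: the size check is
applied before every merge, including the first pair, so a region that is
already at or over the limit is left alone.

```python
from typing import List, Tuple

# Hypothetical stand-in for a region: (name, size). Not an HBase type.
Region = Tuple[str, int]


def plan_merges(regions: List[Region], max_size: int) -> List[List[str]]:
    """Group adjacent regions so each group's combined size stays below max_size.

    Regions must be passed in key order so that each group is contiguous.
    A group with a single region means "nothing to merge here".
    """
    groups: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in regions:
        # Adding this region would reach or exceed the limit: close the
        # current group and start the process again from this region.
        if current and current_size + size >= max_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        groups.append(current)
    return groups
```

Each returned group would then be merged left to right, so only groups with
two or more regions result in actual merge requests.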
I guess this could also be done asynchronously. Instead of merging the regions
immediately, add them to a "todo" list; once you hit the size limit, send the
todo list off to an executor to do the recursive merge. Being unfamiliar with
the merge internals, though, I don't know whether asynchronous merges can or
should be done. I can imagine that causing issues with holes in the region
chain in .META.
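To make the asynchronous idea concrete, here is a small Python sketch under
heavy assumptions. merge_pair is a hypothetical callback standing in for
whatever actually merges two adjacent regions, and merge_batch and
submit_batches are made-up names. The batch is folded with a loop rather than
recursion. The sketch says nothing about whether concurrent merges are safe
with respect to .META.; it only shows the todo-list hand-off.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List, Tuple

Region = Tuple[str, int]  # hypothetical stand-in: (name, size)
MergeFn = Callable[[str, str], str]  # hypothetical: merges two regions, returns new name


def merge_batch(batch: List[str], merge_pair: MergeFn) -> str:
    """Fold one todo list left to right: merge(R1, R2) -> R1', merge(R1', R3), ..."""
    merged = batch[0]
    for nxt in batch[1:]:
        merged = merge_pair(merged, nxt)
    return merged


def submit_batches(regions: List[Region], max_size: int,
                   merge_pair: MergeFn,
                   executor: ThreadPoolExecutor) -> List[Future]:
    """Collect adjacent regions into a todo list and hand each full list to the executor."""
    futures: List[Future] = []
    todo: List[str] = []
    todo_size = 0
    for name, size in regions:
        # Size limit hit: ship the current todo list and start a new one.
        if todo and todo_size + size >= max_size:
            futures.append(executor.submit(merge_batch, todo, merge_pair))
            todo, todo_size = [], 0
        todo.append(name)
        todo_size += size
    if todo:
        futures.append(executor.submit(merge_batch, todo, merge_pair))
    return futures
```

Each batch covers a disjoint, contiguous run of regions, and merges within a
batch run strictly in order; only separate batches may run at the same time.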
Any feedback is welcome.
> OnlineMerge should be extended to allow bulk merging
> ----------------------------------------------------
>
> Key: HBASE-10003
> URL: https://issues.apache.org/jira/browse/HBASE-10003
> Project: HBase
> Issue Type: Improvement
> Components: Admin, Usability
> Affects Versions: 0.98.0, 0.94.6
> Reporter: Clint Heath
> Priority: Critical
> Labels: noob
>
> Now that we have Online Merge capabilities, the function of that tool should
> be extended to make it much easier for HBase operations folks to use.
> Currently it is a very manual process (one fraught with confusion) to
> hand-pick two regions that are contiguous in the META table so that the
> admin can manually request that those two regions be merged.
> In the real world, when admins find themselves wanting to merge regions, it's
> usually because they've greatly increased their hbase.hregion.max.filesize
> property and they have way too many regions on a table and want to reduce the
> region count for that entire table quickly and easily.
> Why can't the OnlineMerge command just take a "-max" argument along with a
> table name which tells it to go ahead and merge all regions of said table
> until the resulting regions are all of max size? This takes the voodoo out
> of the process and quickly gets the admin what they're looking for.
> As part of this improvement, I also suggest a "-regioncount" argument for
> OnlineMerge, which will attempt to reduce the table's region count down to
> the specified number.