Pankaj Kumar commented on HBASE-9081:
Thanks [~jeason] for raising this Jira.
Multiple split points (pre-split) can be defined only at table creation.
After that, a region splits into only two daughter regions, either manually
(using the HBaseAdmin APIs) or automatically (based on the split policy).
Currently there is no way to split a region into more than two daughter regions
in one step; the user needs to send multiple RPCs to retrieve the table regions
and send split requests.
Based on customer experience, there is a need to split a region at multiple
points in a single operation. We could call this "Region Multi Split" instead
of "Online split for an reserved empty region".
There are multiple scenarios where multi split is very useful:
1) At the beginning the user can't predict the incoming data behavior, so the
table is created with a single default region (without pre-split). After some
data has been loaded into the table, the user can predict the data distribution
and define the split points efficiently. But currently, splitting the region
into many regions (let's say 500) is not easy with the existing APIs. User has
to retrieve and split the region
2) When the incoming data rate is very high, the current region split (2
daughter regions) has to happen many times, which consumes a lot of I/O and CPU
resources until the table reaches the desired number of regions (let's say
500). With the new feature, a region can be split directly into the desired
number of regions in a single operation.
Let me know your thoughts on this; I will attach the design doc soon.
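To put rough numbers on scenario 2, here is a minimal sketch of the arithmetic only. It assumes one starting region and the 500-region target used above; the class and method names are hypothetical and nothing in it calls HBase.

```java
// Sketch only: plain arithmetic about two-way splits, no HBase classes involved.
final class BinarySplitCost {

    // Each two-way split adds exactly one region, so growing from 1 region
    // to targetRegions regions takes targetRegions - 1 split operations.
    static int splitOperations(int targetRegions) {
        if (targetRegions < 1) {
            throw new IllegalArgumentException("targetRegions must be >= 1");
        }
        return targetRegions - 1;
    }

    // Even if every region splits at the same time, the region count can at
    // most double per round, so this is a lower bound on the number of rounds.
    static int minimumRounds(int targetRegions) {
        if (targetRegions < 1) {
            throw new IllegalArgumentException("targetRegions must be >= 1");
        }
        int rounds = 0;
        long regions = 1;
        while (regions < targetRegions) {
            regions *= 2;
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        int target = 500; // the example figure used in this comment
        System.out.println(splitOperations(target) + " split operations, at least "
                + minimumRounds(target) + " rounds");
    }
}
```

For 500 regions this gives 499 separate two-way splits spread over at least 9 rounds, compared with one operation for a multi split.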
> Online split for an reserved empty region
> Key: HBASE-9081
> URL: https://issues.apache.org/jira/browse/HBASE-9081
> Project: HBase
> Issue Type: New Feature
> Components: master, regionserver
> Reporter: Jieshan Bean
> Assignee: Jieshan Bean
> Priority: Major
> We already have a region splitter tool. But it can only provide limited
> 1. Create table with a specified region number without giving any splits.
> 2. Roll-Split on an existing region.
> We have a user scenario like this:
> Table was created with splits like below:
> g~o is a reserved empty region. We will use it only after some days, so we
> don't know the rowkey distribution currently. We will split it only when it
> gets used.
> Say we want to split g~o into 10 new regions, like g, g1, g2, g3, g4,
> g5.......,g9, o.
> I didn't find that a similar function already exists. Please tell me if I am
> Hope to hear your ideas on this:)
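For the g~o example in the quoted description, the nine split points (g1 through g9) that would produce 10 regions could be generated as in the sketch below. The class and method names are hypothetical, and the single-digit suffix scheme is only the one implied by the example, not a general key generator.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: builds the split points for the g~o example as plain strings.
final class ReservedRegionSplitKeys {

    // Returns regions - 1 split points of the form startKey + digit.
    // Limited to 10 regions so the suffixes stay single digits and the keys
    // remain in lexicographic order (for example "g10" would sort before "g2").
    static List<String> splitKeys(String startKey, int regions) {
        if (regions < 2 || regions > 10) {
            throw new IllegalArgumentException("regions must be between 2 and 10");
        }
        List<String> keys = new ArrayList<>();
        for (int i = 1; i < regions; i++) {
            keys.add(startKey + i);
        }
        return keys;
    }

    public static void main(String[] args) {
        // Region g~o split into 10 regions: g, g1, ..., g9 are the start keys.
        System.out.println(splitKeys("g", 10));
    }
}
```

With two-way splits, each of these nine keys would need its own split request against whichever region currently contains it.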