[
https://issues.apache.org/jira/browse/HBASE-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078215#comment-14078215
]
Jerry He commented on HBASE-11608:
----------------------------------
Carrying over the design comment from HBASE-2949 so that we can continue the
discussion here.
+Current implementation on the client side+
In HBaseAdmin.split(),
1. Input is a region name
1a. If a split key is given, we will send a split request to the region server
hosting the region containing the key. Asynchronous operation.
1b. If no split key, we will send a split request to the region server hosting
the region, which splits based on the split policy. Asynchronous operation.
2. Input is a table name
2a. If a split key is given, we will send a split request to the region server
hosting the region containing the key. Asynchronous operation.
2b. If no split key, we will send split requests in a loop to each (region, rs)
pair for all the regions of the table, which split based on the split policy.
Asynchronous operation.
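To make the table-name case (with and without a split key) concrete, here is a
minimal sketch of the dispatch decision. It is only a model, not the HBaseAdmin
source: the Region, SplitCall and planTableSplit names are hypothetical, and
row keys are plain Strings here for brevity, whereas HBase uses byte[].

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the client-side dispatch; not actual HBaseAdmin code.
class SplitDispatchSketch {
    static final class Region {
        final String name, startKey, endKey, server;
        Region(String name, String startKey, String endKey, String server) {
            this.name = name;
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }
        // Assumption: an empty endKey marks the last region of the table.
        boolean contains(String key) {
            return key.compareTo(startKey) >= 0
                && (endKey.isEmpty() || key.compareTo(endKey) < 0);
        }
    }

    // One split request to one region server.
    // A null splitKey means "let the split policy pick the split point".
    static final class SplitCall {
        final String server, region, splitKey;
        SplitCall(String server, String region, String splitKey) {
            this.server = server;
            this.region = region;
            this.splitKey = splitKey;
        }
    }

    // Table-name input: with a split key, only the region containing the key
    // is targeted; without one, every region of the table gets a request.
    static List<SplitCall> planTableSplit(List<Region> regions, String splitKey) {
        List<SplitCall> calls = new ArrayList<>();
        for (Region r : regions) {
            if (splitKey == null) {
                calls.add(new SplitCall(r.server, r.name, null));
            } else if (r.contains(splitKey)) {
                calls.add(new SplitCall(r.server, r.name, splitKey));
                break; // exactly one region can contain the key
            }
        }
        return calls;
    }
}
```

In the current asynchronous client, each planned call would be sent without
waiting for the split to finish.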
+Current implementation on the region server side+
HRegionServer -> RSRpcServices.splitRegion() ->
compactSplitThread.requestSplit() --> CompactSplitThread.splits.execute(new
SplitRequest())
CompactSplitThread.splits is a ThreadPoolExecutor.
SplitRequest is a Runnable that does all the split work.
+Goal+
Backward compatible, no impact on existing system-triggered splits, minimal
overall impact. More?
+Proposed change on the region server side+
Provide a synchronous
compactSplitThread.requestSplitSynchronous() -->
CompactSplitThread.splits.submit(new SplitRequest()), which waits for the
returned Future to complete.
SplitRequest is currently a Runnable.
We could change it to a Callable<Void> so that any exception thrown during the
split operation propagates up to the caller, since we want to be notified of
errors/exceptions synchronously as well.
HRegionServer -> RSRpcServices.splitRegionSynchronous() will use the new
requestSplitSynchronous().
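A minimal sketch of the submit-and-wait idea, using only java.util.concurrent.
This is not HBase code: the class, the splitRequest() factory and its 'fail'
flag are hypothetical stand-ins, and the single-thread pool merely plays the
role of CompactSplitThread.splits.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of the proposal; not actual HBase code.
class SyncSplitSketch {
    // Stand-in for CompactSplitThread.splits.
    private final ExecutorService splits = Executors.newFixedThreadPool(1);

    // Stand-in for SplitRequest reworked as a Callable<Void>.
    // The 'fail' flag exists only to simulate a failing split.
    static Callable<Void> splitRequest(String regionName, boolean fail) {
        return () -> {
            if (fail) {
                throw new IOException("split failed for " + regionName);
            }
            return null; // real split work would happen before returning
        };
    }

    // Asynchronous path: hand the request to the pool and return immediately.
    void requestSplit(Callable<Void> request) {
        splits.submit(request);
    }

    // Proposed synchronous path: block on the Future and rethrow the cause.
    void requestSplitSynchronous(Callable<Void> request)
            throws IOException, InterruptedException {
        Future<Void> future = splits.submit(request);
        try {
            future.get(); // waits until the split task has finished
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException(cause);
        }
    }

    void shutdown() {
        splits.shutdown();
    }
}
```

With a Callable, a checked exception thrown by the split surfaces to the caller
as the cause of the ExecutionException, which is what lets the synchronous RPC
report the failure.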
+On the HBaseAdmin side+
If we go synchronous in HBaseAdmin.split(), one use case will become less
efficient, namely case 2b from above:
bq. 2b. If no split key, we will send split requests in a loop to (region, rs)
pair for all the regions of the table to split based on the split policy.
In this case, each (region, rs) pair would be requested serially and
synchronously.
We'll provide HBaseAdmin.splitSynchronous() and keep the current asynchronous
HBaseAdmin.split(), so that users are aware of the choices they have.
In the future, HBaseAdmin.splitSynchronous() can be improved (e.g. by using a
global procedure to split regions in parallel).
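As a rough illustration of the parallel direction, the sketch below fans the
blocking per-region calls out over a client-side thread pool and waits for all
of them. It is not a global procedure and not an HBase API: the RegionSplitter
interface, splitAll() and the parallelism parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; not actual HBase code.
class ParallelSplitSketch {
    // Stand-in for a blocking, per-region synchronous split call.
    interface RegionSplitter {
        void splitSynchronous(String regionName) throws Exception;
    }

    // Splits all regions concurrently and returns region -> error message
    // for the ones that failed; an empty map means every split succeeded.
    static Map<String, String> splitAll(List<String> regionNames,
                                        RegionSplitter splitter,
                                        int parallelism)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (String region : regionNames) {
                tasks.add(() -> {
                    splitter.splitSynchronous(region);
                    return null;
                });
            }
            // invokeAll blocks until every task is done and returns the
            // futures in the same order as the task list.
            List<Future<Void>> futures = pool.invokeAll(tasks);
            Map<String, String> failures = new LinkedHashMap<>();
            for (int i = 0; i < futures.size(); i++) {
                try {
                    futures.get(i).get();
                } catch (ExecutionException e) {
                    failures.put(regionNames.get(i),
                                 String.valueOf(e.getCause().getMessage()));
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller still blocks until the whole table is done, but the total wait is
bounded by the slowest batch rather than the sum of all region splits.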
This is my initial thinking.
Your comments please: is this doable, are there any concerns, have I missed or
misunderstood anything? I am sure I have :-)
> Add synchronous split
> ---------------------
>
> Key: HBASE-11608
> URL: https://issues.apache.org/jira/browse/HBASE-11608
> Project: HBase
> Issue Type: New Feature
> Components: Admin
> Affects Versions: 0.99.0, 0.98.5
> Reporter: Jerry He
>
> Users have been asking for this. We have an internal requirement for this as
> well.
> The goal is to provide an Admin API (and shell command) so that users can
> request to split a region or table and get the split completion result
> synchronously.