[ 
https://issues.apache.org/jira/browse/HBASE-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076985#comment-14076985
 ] 

Jerry He commented on HBASE-2949:
---------------------------------

Hi, [~stack]

Thanks for the input.

I was just looking into the code and started on some experimental code. Here 
is what I think about providing a user-triggered synchronous split.
(I will carry this over to a new JIRA afterwards.)

+Current implementation on the client side+

In HBaseAdmin.split(), 
1.   Input is a region name
1a.     If a split key is given, we send a split request to the region 
server hosting the region containing the key. Asynchronous operation.
1b.     If no split key is given, we send a split request to the region server 
hosting the region, to split based on the split policy.  Asynchronous operation.
2.   Input is a table name
2a.     If a split key is given, we send a split request to the region 
server hosting the region containing the key.  Asynchronous operation.
2b.     If no split key is given, we send split requests in a loop to the 
(region, rs) pair for each region of the table, to split based on the split 
policy. Asynchronous operation.

+Current implementation on the region server side+

HRegionServer -> RSRpcServices.splitRegion() -> 
compactSplitThread.requestSplit() --> CompactSplitThread.splits.execute(new 
SplitRequest())
CompactSplitThread.splits is a ThreadPoolExecutor.
SplitRequest is a Runnable that does all the split work.
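
To make the asynchronous behaviour concrete, here is a minimal sketch of the execute(Runnable) pattern described above. It is not the HBase source: the class AsyncSplitSketch, the simplified SplitRequest, the 'finished' latch and the demo() helper are all hypothetical stand-ins, and the real split work is elided.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins for illustration only; this is not the HBase source.
class AsyncSplitSketch {

    // Mirrors the shape of SplitRequest today: a Runnable, so run() can
    // neither return a value nor throw a checked exception to the requester.
    static class SplitRequest implements Runnable {
        final String regionName;
        // Hook added for this sketch (an assumption) so a demo can observe completion.
        final CountDownLatch finished = new CountDownLatch(1);

        SplitRequest(String regionName) {
            this.regionName = regionName;
        }

        @Override
        public void run() {
            try {
                // The actual split work would happen here.
            } finally {
                finished.countDown();
            }
        }
    }

    // Fire and forget: execute() returns void, so the caller learns only
    // that the request was queued, not whether the split succeeded.
    static SplitRequest requestSplit(ExecutorService splits, String regionName) {
        SplitRequest request = new SplitRequest(regionName);
        splits.execute(request);
        return request;
    }

    // Queues one request and reports whether it completed within the timeout.
    static boolean demo() {
        ExecutorService splits = Executors.newFixedThreadPool(1);
        try {
            SplitRequest request = requestSplit(splits, "region-a");
            return request.finished.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            splits.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("split finished: " + demo());
    }
}
```

Without the extra latch, the caller of requestSplit() would have no handle on the outcome at all, which is the gap the synchronous variant is meant to close.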

+Goal+

Backward compatible, no impact on existing system triggered split, minimum 
overall impact, more??

+Proposed change on the region server side+

Provide a synchronous
compactSplitThread.requestSplitSynchronous() --> 
CompactSplitThread.splits.submit(new SplitRequest()) and wait for the 'future' 
to complete.
SplitRequest currently is a Runnable.  
Possibly change it to Callable<Void> so that we can propagate up any exceptions 
thrown during the split operation,
as we want to get any errors/exceptions synchronously as well. 

HRegionServer -> RSRpcServices.splitRegionSynchronous() will use the new 
requestSplitSynchronous().
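
A rough sketch of what the region server side could look like, under stated assumptions: SyncSplitSketch, the simplified SplitRequest, the failForDemo switch and the trySplit() helper are hypothetical and only illustrate the submit(Callable<Void>) plus Future.get() idea. The real method signatures and exception types would have to fit the existing code.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-ins for illustration only; this is not the HBase source.
class SyncSplitSketch {

    // SplitRequest as a Callable<Void>: call() may throw, and the executor
    // captures that exception inside the returned Future.
    static class SplitRequest implements Callable<Void> {
        final String regionName;
        final boolean failForDemo; // hypothetical switch that simulates a failed split

        SplitRequest(String regionName, boolean failForDemo) {
            this.regionName = regionName;
            this.failForDemo = failForDemo;
        }

        @Override
        public Void call() throws IOException {
            if (failForDemo) {
                throw new IOException("split failed for " + regionName);
            }
            // The actual split work would happen here.
            return null;
        }
    }

    // Submits the request to the same kind of pool and blocks until it is
    // done, rethrowing whatever the split threw.
    static void requestSplitSynchronous(ExecutorService splits, SplitRequest request)
            throws IOException {
        Future<Void> future = splits.submit(request);
        try {
            future.get();
        } catch (ExecutionException e) {
            // get() wraps the exception thrown by call(); unwrap it for the caller.
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException(cause);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for split", e);
        }
    }

    // Returns null on success, or the error message if the split failed.
    static String trySplit(String regionName, boolean fail) {
        ExecutorService splits = Executors.newFixedThreadPool(1);
        try {
            requestSplitSynchronous(splits, new SplitRequest(regionName, fail));
            return null;
        } catch (IOException e) {
            return e.getMessage();
        } finally {
            splits.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("region-a: " + trySplit("region-a", false));
        System.out.println("region-b: " + trySplit("region-b", true));
    }
}
```

Since submit() goes through the same executor, the existing system-triggered path that uses execute() would not need to change in this sketch.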

+On the HBaseAdmin side+

If we go synchronous in HBaseAdmin.split(), one use case will become 
less efficient, i.e. the 2b case from above:
bq. 2b.     If no split key is given, we send split requests in a loop to the 
(region, rs) pair for each region of the table, to split based on the split policy.
This is because in this case each (region, rs) pair will be requested serially 
and synchronously.

We'll provide an HBaseAdmin.splitSynchronous() and keep the current asynchronous 
HBaseAdmin.split(). 
That way users are aware of the choices they have.
In the future HBaseAdmin.splitSynchronous() can be improved (e.g. using a global 
procedure to split regions in parallel).
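
To illustrate the serial-versus-parallel trade-off for the whole-table case, here is a small sketch. It uses a plain client-side thread pool, not the global procedure mentioned above, and everything in it (TableSplitSketch, splitRegionSynchronous() as a stand-in for one blocking per-region request, the result strings) is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-ins for illustration only; this is not the HBase source.
class TableSplitSketch {

    // Stand-in for one blocking split request to the server hosting a region.
    static String splitRegionSynchronous(String regionName) {
        return "split:" + regionName;
    }

    // Serial: each region is requested only after the previous one returns,
    // so total time is roughly the sum of the individual split times.
    static List<String> splitSerially(List<String> regions) {
        List<String> results = new ArrayList<>();
        for (String region : regions) {
            results.add(splitRegionSynchronous(region));
        }
        return results;
    }

    // Parallel: still synchronous for the caller, but the per-region requests
    // overlap. invokeAll() blocks until every task has completed and returns
    // the futures in the same order as the task list.
    static List<String> splitInParallel(List<String> regions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String region : regions) {
                tasks.add(() -> splitRegionSynchronous(region));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> regions = List.of("r1", "r2", "r3");
        System.out.println(splitSerially(regions));
        System.out.println(splitInParallel(regions, 3));
    }
}
```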

This is my initial thinking.   
Your comments please: whether this is doable, whether there are any concerns, 
and whether I have missed or misunderstood anything.  I am I ado :-)

> Add synchronous compact/split/flush or add a callback that gets pulled when 
> compact/split/flush completes (and a progressable on how much is done)
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2949
>                 URL: https://issues.apache.org/jira/browse/HBASE-2949
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>
> Users have asked for this w/ a while.  They start a major compaction and are 
> a little baffled when the shell returns immediately.  We need to make the 
> call synchronous or add a callback or provide a progressable so user can see 
> how complete the task is.  Should we include a cancel task?  "Hey! Why did 
> everything go slow of a sudden?" .... 5 minutes later "A major compaction 
> started on hbase... how do we turn it off?"
> The need for this feature was also mentioned in HBASE-2701



--
This message was sent by Atlassian JIRA
(v6.2#6252)
