[jira] [Commented] (HBASE-11608) Add synchronous split

2014-08-04 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085219#comment-14085219
 ] 

Jerry He commented on HBASE-11608:
--

Yes. 

I was looking at the current synchronous HBaseAdmin APIs and how they are 
implemented.  There are two different approaches.

+1+   Inside the API implementation, the first part sends the request 
asynchronously. 
   The second part then polls for status (a direct RPC inquiry, or an indirect 
inquiry through meta status, etc.) until the request is completed.
   Examples: deleteTable, disableTable and snapshot.
   The identifier used to poll the status is the table name or the snapshot 
name in these cases.

+2+   The second approach is a real synchronous call.  The server side will not 
return until the work is really completed.
   For example, flush.

Other HBaseAdmin APIs are asynchronous, for example, the current compact and 
split API. 
The client only submits the request without waiting for status at all.
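
The first approach (submit asynchronously, then poll until done) can be sketched roughly as below. This is only an illustration: SubmitThenPoll and submitAndWait are hypothetical names, not HBaseAdmin code, and the submit step and status inquiry are passed in as plain callbacks rather than real RPCs.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper illustrating "submit, then poll"; not part of HBaseAdmin. */
final class SubmitThenPoll {
    private SubmitThenPoll() {}

    /**
     * Runs the asynchronous submit step, then polls isDone until it reports
     * completion or the timeout elapses. Returns true if completion was observed.
     */
    static boolean submitAndWait(Runnable submit, BooleanSupplier isDone,
                                 long timeoutMs, long pollIntervalMs) {
        submit.run();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (isDone.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out; the operation may still finish later
            }
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A false return only means the client stopped waiting; it says nothing about whether the server-side work eventually succeeds.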

For this JIRA,  the 2nd approach is cleaner.

We could do the 1st approach.  For example, the client asynchronously submits 
the split request.
Then the client loops for the split to complete by polling on the meta status 
to wait for the parent to split and daughters to come online.
This is similar to what deleteTable and disableTable do.
The identifier used to poll and inquire about the status is the parent region name.  
But this approach will be a little messy.
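
To make the polling condition concrete, here is a minimal sketch of the "parent is split and both daughters are online" check. Everything in it is an assumption for illustration: SplitStatusSketch, RegionState and the in-memory maps stand in for whatever the client would actually read from meta, and they do not reflect the real meta format.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for region state a client might derive from meta. */
final class SplitStatusSketch {
    enum RegionState { ONLINE, SPLITTING, SPLIT }

    private final Map<String, RegionState> stateByRegion = new HashMap<>();
    private final Map<String, List<String>> daughtersByParent = new HashMap<>();

    void setState(String region, RegionState state) {
        stateByRegion.put(region, state);
    }

    void setDaughters(String parent, String daughterA, String daughterB) {
        daughtersByParent.put(parent, Arrays.asList(daughterA, daughterB));
    }

    /** True only when the parent is marked SPLIT and every recorded daughter is ONLINE. */
    boolean isSplitComplete(String parent) {
        if (stateByRegion.get(parent) != RegionState.SPLIT) {
            return false;
        }
        List<String> daughters = daughtersByParent.get(parent);
        if (daughters == null) {
            return false;
        }
        for (String daughter : daughters) {
            if (stateByRegion.get(daughter) != RegionState.ONLINE) {
                return false;
            }
        }
        return true;
    }
}
```

The check is keyed on the parent region name, which is the identifier the client already has when it submits the request.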

 Add synchronous split
 -

 Key: HBASE-11608
 URL: https://issues.apache.org/jira/browse/HBASE-11608
 Project: HBase
  Issue Type: New Feature
  Components: Admin
Affects Versions: 0.99.0, 0.98.5
Reporter: Jerry He

 Users have been asking for this. We have an internal requirement for this as 
 well.
 The goal is to provide an Admin API (and shell command) so that users can 
 request to split a region or table and get the split completion result 
 synchronously.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11608) Add synchronous split

2014-08-04 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085250#comment-14085250
 ] 

Matteo Bertozzi commented on HBASE-11608:
-

The current admin implementation is kind of broken: since we lack the 
infrastructure to build proper sync operations, over time we implemented 
workarounds to obtain sync behavior.

The problems are:
 * If you have the sync call on the server and the operation is long: 1) you 
are blocking all the other rpc handlers. 2) in case of failure on the client 
side or server side you don't know the state of the operation.
 * If you poll by relying on something like META, we are stuck keeping that 
execution order and that META format to preserve client compatibility. (One 
solution to avoid this problem is to have something like the isDone method for 
snapshot, where the server gives you the status of the operation. But you end 
up with tons of isDone methods, one for each sync operation that you want.)

I'd say that we should stop adding features that can't be implemented in a 
straightforward way and that may prevent future rolling upgrade compatibility. 
HBASE-5487 and HBASE-9864 should solve all these problems with long operations 
and operations across the cluster. 



[jira] [Commented] (HBASE-11608) Add synchronous split

2014-08-04 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085436#comment-14085436
 ] 

Jerry He commented on HBASE-11608:
--

Linked the issues here.  HBASE-5487, HBASE-9864 and HBASE-10544



[jira] [Commented] (HBASE-11608) Add synchronous split

2014-08-01 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14082941#comment-14082941
 ] 

Nick Dimiduk commented on HBASE-11608:
--

I was thinking of the case when the connection between client and RS is 
dropped. When that happens, the result of the split operation can not be 
discovered, other than by side-effect. There are many other meta operations 
like this where the client cannot know a result except by side-effect. As I 
understand it, resolving these kinds of issues in a general case is the primary 
motivation behind HBASE-5487.



[jira] [Commented] (HBASE-11608) Add synchronous split

2014-07-29 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078179#comment-14078179
 ] 

Jerry He commented on HBASE-11608:
--

Linked to an old issue that raised this request.



[jira] [Commented] (HBASE-11608) Add synchronous split

2014-07-29 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078215#comment-14078215
 ] 

Jerry He commented on HBASE-11608:
--

Carrying over the design comment from HBASE-2949 so that we can continue the 
discussion here.

+Current implementation on the client side+

In HBaseAdmin.split(), 
1. Input is a region name
1a. If a split key is given, we will send a split request to the region server 
hosting the region containing the key. Asynchronous operation.
1b. If no split key is given, we will send a split request to the region server 
hosting the region, to split based on the split policy. Asynchronous operation.
2. Input is a table name
2a. If a split key is given, we will send a split request to the region server 
hosting the region containing the key. Asynchronous operation.
2b. If no split key is given, we will send split requests in a loop to each 
(region, rs) pair for all the regions of the table, to split based on the split 
policy. Asynchronous operation.
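
As a rough illustration of the table-name case (2a and 2b), the sketch below picks which regions would receive a split request. It is not the HBaseAdmin code: SplitTargets and forTable are hypothetical names, and a table is modeled simply as a sorted map from region start key to region name, with the first region starting at the empty key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of choosing split targets for a table; not HBaseAdmin code. */
final class SplitTargets {
    private SplitTargets() {}

    /**
     * regionsByStartKey maps each region's start key to its name.
     * With a split key, only the region containing that key is targeted (case 2a).
     * Without one, every region of the table is targeted (case 2b).
     */
    static List<String> forTable(TreeMap<String, String> regionsByStartKey, String splitKey) {
        if (splitKey != null) {
            // The containing region is the one with the greatest start key <= splitKey.
            Map.Entry<String, String> containing = regionsByStartKey.floorEntry(splitKey);
            if (containing == null) {
                return Collections.<String>emptyList();
            }
            return Collections.singletonList(containing.getValue());
        }
        return new ArrayList<>(regionsByStartKey.values());
    }
}
```

In case 2b the caller would then loop over the returned regions and send one request per (region, rs) pair.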

+Current implementation on the region server side+

HRegionServer -> RSRpcServices.splitRegion() -> 
compactSplitThread.requestSplit() -> CompactSplitThread.splits.execute(new 
SplitRequest())
CompactSplitThread.splits is a ThreadPoolExecutor.
SplitRequest is a Runnable that does all the split work.

+Goal+

Backward compatible, no impact on existing system triggered split, minimum 
overall impact, more??

+Proposed change on the region server side+

Provide a synchronous
compactSplitThread.requestSplitSynchronous() -> 
CompactSplitThread.splits.submit(new SplitRequest()) and wait for the 'future' 
to complete.
SplitRequest currently is a Runnable. 
Possibly change it to Callable<Void> so that we can propagate any exceptions 
thrown during the split operation,
as we want to be notified of any errors/exceptions synchronously as well.
HRegionServer -> RSRpcServices.splitRegionSynchronous() will use the new 
requestSplitSynchronous().
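
The difference between execute(Runnable) and submit(Callable) plus Future.get() can be sketched with plain java.util.concurrent, as below. SplitPoolSketch is a hypothetical stand-in for CompactSplitThread, and wrapping failures in IllegalStateException is just a choice made for this sketch; the real change might surface errors differently.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for the split thread pool; not the actual CompactSplitThread. */
final class SplitPoolSketch {
    // Daemon worker so that this sketch never keeps the JVM alive on its own.
    private final ExecutorService splits = Executors.newFixedThreadPool(1, r -> {
        Thread t = new Thread(r, "split-sketch");
        t.setDaemon(true);
        return t;
    });

    /** Current style: hand the work to the pool and return immediately. */
    void requestSplit(Runnable splitWork) {
        splits.execute(splitWork);
    }

    /** Proposed style: block until the work finishes and rethrow any failure. */
    void requestSplitSynchronous(Callable<Void> splitWork) {
        Future<Void> future = splits.submit(splitWork);
        try {
            future.get();
        } catch (ExecutionException e) {
            // The exception thrown inside the Callable is available as the cause.
            throw new IllegalStateException("split failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for split", e);
        }
    }

    void shutdown() {
        splits.shutdown();
    }
}
```

With a Runnable passed to execute(), an exception thrown by the split work never reaches the caller; with a Callable and Future.get(), it comes back wrapped in an ExecutionException.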

+On the HBaseAdmin side+

If we go synchronous in HBaseAdmin.split(), one use case will become less 
efficient, i.e. the 2b case from above:
bq. 2b. If no split key is given, we will send split requests in a loop to each 
(region, rs) pair for all the regions of the table, to split based on the split 
policy.
This is because each (region, rs) pair will be requested serially and 
synchronously.
We'll provide a HBaseAdmin.splitSynchronous() and keep the current asynchronous 
HBaseAdmin.split(). 
Users are kept aware of the choices they have.
In the future HBaseAdmin.splitSynchronous() can be improved (e.g. using a global 
procedure to split regions in parallel).
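
To show why the serial form is slower for a whole table, here is a small sketch contrasting a serial loop with a client-side parallel fan-out. Note that this uses a plain thread pool, not the global procedure mechanism mentioned above, and TableSplitFanOut and the blockingSplit callback are hypothetical names for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical sketch: serial vs. parallel issuing of blocking per-region splits. */
final class TableSplitFanOut {
    private TableSplitFanOut() {}

    /** Serial: each region's blocking split must finish before the next one starts. */
    static void splitSerially(List<String> regions, Consumer<String> blockingSplit) {
        for (String region : regions) {
            blockingSplit.accept(region);
        }
    }

    /** Parallel: issue all blocking splits concurrently, then wait for every one. */
    static void splitInParallel(List<String> regions, Consumer<String> blockingSplit,
                                int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (String region : regions) {
                tasks.add(() -> {
                    blockingSplit.accept(region);
                    return null;
                });
            }
            // invokeAll returns once all tasks are done; get() surfaces any failure.
            for (Future<Void> result : pool.invokeAll(tasks)) {
                result.get();
            }
        } catch (ExecutionException e) {
            throw new IllegalStateException("a region split failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for splits", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In the serial form the total wait is roughly the sum of the individual split times, while the parallel form is bounded by the slowest split when there are enough threads.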

This is my initial thinking. 
Your comments please, if this is doable, if there is any concern, if I miss or 
misunderstand anything. I am sure I do :-)



[jira] [Commented] (HBASE-11608) Add synchronous split

2014-07-29 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078343#comment-14078343
 ] 

Nick Dimiduk commented on HBASE-11608:
--

Nice feature [~jinghe].

How will your future approach handle intermittent connectivity issues between 
client and RS? Have you looked at HBASE-5487 ?



[jira] [Commented] (HBASE-11608) Add synchronous split

2014-07-29 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078597#comment-14078597
 ] 

Jerry He commented on HBASE-11608:
--

Hi, [~ndimiduk]

I was aware of the grand plan in HBASE-5487, and just went through it again :-)

bq. How will your future approach handle intermittent connectivity issues 
between client and RS?
If the client issues a synchronous split request, the client RPC connection 
will be held for a longer period of time.

Or do you mean in the future if we want to use Master-coordinated tasks to split 
the table synchronously, how do we deal with master and/or RS failures?

