[ 
https://issues.apache.org/jira/browse/HBASE-12853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277413#comment-14277413
 ] 

Michael Segel  commented on HBASE-12853:
----------------------------------------

Lars, 
No, it will all be server side. 
That's the beauty of it. The client won't know anything about the underlying 
differences. 

Today you can easily do this client side, but then you are responsible for 
managing the N scanners and merging the result sets. The idea is to do this 
server side so that clients won't need to know any of the details. 
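
To make the client-side burden concrete, here is a rough sketch of the merge 
step, not HBase code. It assumes each bucket's scan is an iterable of 
(salted_key, value) pairs already sorted by key, and that the prefix is 
separated from the real key by "-". The name merged_scan and the string keys 
are made up for illustration; real row keys are byte arrays. 

```python
import heapq


def merged_scan(bucket_scans, sep="-"):
    """Merge N per-bucket scans into one stream sorted by the unsalted key.

    Assumption: every scan yields (salted_key, value) pairs in sorted order,
    and all keys within one scan share the same "<bucket><sep>" prefix.
    """
    def strip_prefix(scan):
        for salted_key, value in scan:
            # Drop everything up to and including the first separator.
            yield salted_key.split(sep, 1)[1], value

    # heapq.merge does a lazy k-way merge; it never loads a whole bucket.
    return heapq.merge(*(strip_prefix(s) for s in bucket_scans),
                       key=lambda kv: kv[0])


buckets = [
    [("1-apple", 1), ("1-melon", 2)],
    [("2-banana", 3), ("2-zebra", 4)],
    [("3-cherry", 5)],
]
rows = list(merged_scan(buckets))
```

Because every scan in a bucket shares one prefix, stripping it keeps that 
scan sorted, so the merge only ever has to compare the current head of each 
scanner. 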

Again, Phoenix implies that it does something like this. However, having a 
tighter coupling to HBase would mean that there are no client-side changes. 
Clients would have one API to get data from a regular table or one that used 
buckets. The only difference would be in the table definition and parameters 
for the table. 

Does that make sense? 


> distributed write pattern to replace ad hoc 'salting'
> -----------------------------------------------------
>
>                 Key: HBASE-12853
>                 URL: https://issues.apache.org/jira/browse/HBASE-12853
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Michael Segel 
>            Priority: Minor
>
> In reviewing HBASE-11682 (Description of Hot Spotting), one of the issues is 
> that while 'salting' alleviated regional hot spotting, it increased the 
> complexity required to utilize the data.  
> Through the use of coprocessors, it should be possible to offer a method 
> which distributes the data on write across the cluster and then manages 
> reading the data returning a sort ordered result set, abstracting the 
> underlying process. 
> On table creation, a flag is set to indicate that this is a parallel table. 
> On insert into the table, if the flag is set to true then a prefix is added 
> to the key, e.g. <region server#>- or <region server #|| where the region 
> server # is an integer between 1 and the number of region servers defined.  
> On read (scan), for each region server defined, a separate scan is created 
> adding the prefix. Since each scan will be in sort order, it's possible to 
> strip the prefix and return the lowest value key from each of the subsets. 
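
As an illustration of the write side and the per-bucket scan ranges described 
above, here is a minimal sketch. The description does not say how the bucket 
number is chosen; hashing the row key (CRC32 here) is an assumption on my 
part, as are the helper names salted_key and bucket_ranges and the "-" 
separator. 

```python
import zlib


def salted_key(row_key: bytes, num_buckets: int) -> bytes:
    """Prefix row_key with a bucket number in 1..num_buckets.

    Assumption: the bucket is derived from a hash of the key, so the same
    key always lands in the same bucket.
    """
    bucket = zlib.crc32(row_key) % num_buckets + 1
    return b"%d-%s" % (bucket, row_key)


def bucket_ranges(start_row: bytes, stop_row: bytes, num_buckets: int):
    """Turn one logical [start_row, stop_row) scan into one range per bucket."""
    return [
        (b"%d-%s" % (b, start_row), b"%d-%s" % (b, stop_row))
        for b in range(1, num_buckets + 1)
    ]


key = salted_key(b"row1", 4)
ranges = bucket_ranges(b"a", b"m", 3)
```

A deterministic prefix also keeps point gets cheap, since the reader can 
recompute the bucket instead of probing every one. 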



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
