[ 
https://issues.apache.org/jira/browse/HBASE-24298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158851#comment-17158851
 ] 

star edited comment on HBASE-24298 at 7/16/20, 3:08 AM:
--------------------------------------------------------

A prototype implementation for review. It is a configurable feature of the 
HBase client. You can enable the optimization per table by setting a property 
as follows.
{code:java}
// table 'user' is optimized with ByteCopyOnWriteArrayMap
conf.set("hbase.client.meta.cache.optimize.user", "4,2");{code}
The HBase client will cache the meta data of table 'user' and locate regions 
in an optimized way. "4,2" means skipping the first 4 bytes of the key and 
using the following 2 bytes to narrow the range of the binary search.
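
As an illustration of the "skip,length" value, here is a minimal sketch of how "4,2" could be turned into a bucket index for a key. The class PrefixSpec and its methods are hypothetical names chosen for this example and are not taken from the patch. Limiting the length to 1 or 2 bytes and treating missing bytes as zero are also assumptions.

```java
/** Hypothetical helper: interprets a "skip,length" value such as "4,2". */
final class PrefixSpec {
    final int skip;    // number of leading key bytes to ignore
    final int length;  // number of following bytes used as the bucket index

    PrefixSpec(int skip, int length) {
        // Assumption: only 1 or 2 index bytes are supported.
        if (skip < 0 || length < 1 || length > 2) {
            throw new IllegalArgumentException("expected skip >= 0 and length 1 or 2");
        }
        this.skip = skip;
        this.length = length;
    }

    /** Parses a value of the form "skip,length". */
    static PrefixSpec parse(String value) {
        String[] parts = value.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"skip,length\" but got: " + value);
        }
        return new PrefixSpec(Integer.parseInt(parts[0].trim()),
                              Integer.parseInt(parts[1].trim()));
    }

    /** Reads the index bytes as an unsigned big-endian number; missing bytes count as 0. */
    int bucketOf(byte[] key) {
        int bucket = 0;
        for (int i = 0; i < length; i++) {
            int pos = skip + i;
            int b = pos < key.length ? (key[pos] & 0xFF) : 0;
            bucket = (bucket << 8) | b;
        }
        return bucket;
    }
}
```

With this sketch, PrefixSpec.parse("4,2").bucketOf(key) combines the bytes at offsets 4 and 5 of the key into a value between 0 and 65535. Such a value only preserves key order if the skipped bytes are the same for all keys being compared.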

The optimization reduces CPU time by 50% in a simple performance test case 
taken from our production environment, 
TestByteCopyOnWriteMaps#testLowerKeyPerformance.

was (Author: starphin):
A prototype implementation for review. It is a configurable feature of the 
HBase client. You can enable the optimization per table by setting a property 
as follows.
{code:java}
// table 'user' is optimized with ByteCopyOnWriteArrayMap
conf.set("hbase.client.meta.cache.optimize.user", "4,2");{code}
The HBase client will cache the meta data of table 'user' and locate regions 
in an optimized way. "4,2" means skipping the first 4 bytes of the key and 
using the following 2 bytes to narrow the range of the binary search.

> Reduce cpu load of locating region especially in batch mode.
> ------------------------------------------------------------
>
>                 Key: HBASE-24298
>                 URL: https://issues.apache.org/jira/browse/HBASE-24298
>             Project: HBase
>          Issue Type: Bug
>            Reporter: star
>            Assignee: star
>            Priority: Major
>         Attachments: HBASE-24298.patch, locating region.svg
>
>
> Binary search is used to speed up the process of locating a region. It is 
> already fast, yet the CPU of the HBase client becomes the bottleneck when 
> running the YCSB benchmark. We can make locating a region faster to reduce 
> CPU load in some special cases, which nevertheless are the common case in 
> our production environment. The conditions are: 
> 1. Predefined splits in uniform distribution.
> 2. Load data in batch mode.
> The optimization is very simple: narrow the range of the binary search. 
> Initially, record the startIndex and endIndex for each value of the first 
> one or two bytes of the keys. When a key to locate arrives, look up the 
> narrowed startIndex and endIndex for that key, then run the normal binary 
> search within that range. 
> Ideally, this reduces CPU cost to 1/8 with 1 byte or 1/16 with 2 bytes.
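
The range-narrowing idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the attached patch. The class PrefixIndexedLocator and its methods are hypothetical names. It indexes on the first byte only, with no skipped bytes, and assumes region start keys are sorted in unsigned lexicographic order with an empty key for the first region.

```java
/** Hypothetical sketch: floor lookup over sorted start keys, narrowed by the first key byte. */
final class PrefixIndexedLocator {
    private final byte[][] startKeys;               // sorted ascending, unsigned lexicographic
    private final int[] bucketStart = new int[257]; // bucketStart[b] = number of keys with first byte < b

    PrefixIndexedLocator(byte[][] sortedStartKeys) {
        this.startKeys = sortedStartKeys;
        int[] counts = new int[256];
        for (byte[] key : sortedStartKeys) {
            counts[prefix(key)]++;
        }
        for (int b = 0; b < 256; b++) {
            bucketStart[b + 1] = bucketStart[b] + counts[b];
        }
    }

    /** First byte as an unsigned value; the empty key maps to bucket 0. */
    static int prefix(byte[] key) {
        return key.length == 0 ? 0 : key[0] & 0xFF;
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** Index of the last start key <= row, searching only the bucket of the row's first byte. */
    int locate(byte[] row) {
        int b = prefix(row);
        return floor(row, bucketStart[b], bucketStart[b + 1]);
    }

    /** Same lookup with a plain binary search over all keys, for comparison. */
    int locateFull(byte[] row) {
        return floor(row, 0, startKeys.length);
    }

    // Finds the first key > row in [lo, hi) and returns the index just before it.
    private int floor(byte[] row, int lo, int hi) {
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(startKeys[mid], row) <= 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo - 1;
    }
}
```

Both methods return the same index. The narrowed search only performs fewer comparisons: with evenly distributed keys, a one-byte index leaves about 1/256 of the keys in each bucket, which removes up to 8 steps from the binary search.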



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
