[jira] Commented: (HBASE-2468) Improvements to prewarm META cache on clients

HBase Review Board (JIRA) Wed, 09 Jun 2010 15:42:40 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877244#action_12877244
 ]

HBase Review Board commented on HBASE-2468:
-------------------------------------------

Message from: "Mingjie Lai" <[email protected]>

bq.  On 2010-06-07 14:23:42, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java, line 96
bq.  > <http://review.hbase.org/r/98/diff/5/?file=944#file944line96>
bq.  >
bq.  >     getRowOrBefore is an expensive call.  Are we sure we are not calling this too often?

I agree it is an expensive call. 

However, I don't think it introduces any performance penalty for existing or 
potential use cases:
Use case 1 -- existing MetaScanner users: since this method is newly added, 
existing users are not affected.
Use case 2 -- HBase clients locating a region:
1) If prefetch is on, the client calls this MetaScanner method with the 
[table + row] combination. The method calls getRowOrBefore() to get the current 
region info, then scans a number of the following regions from meta. After 
that, the client can get the region info directly from its cache.
2) If prefetch is disabled (the current behavior), the client eventually calls 
the similar method getClosestRowBefore() to get the desired region.

So whether prefetch is on or not, getRowOrBefore() (or getClosestRowBefore()) 
is eventually called. The only difference is whether the following regions are 
also scanned from meta.

Future MetaScanner users that start a scan from the region containing a given 
user table row will have to pay this cost, since that is the expected behavior.

- Mingjie

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/98/#review144
-----------------------------------------------------------

> Improvements to prewarm META cache on clients
> ---------------------------------------------
>
>                 Key: HBASE-2468
>                 URL: https://issues.apache.org/jira/browse/HBASE-2468
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: Todd Lipcon
>            Assignee: Mingjie Lai
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2468-trunk.patch
>
>
> A couple of different use cases cause storms of reads to META during startup. 
> For example, a large MR job will cause each map task to hit META, since each 
> task starts with an empty cache.
> Several possible improvements have been proposed:
>  - MR jobs could ship a copy of META for the table in the DistributedCache
>  - Clients could prewarm cache by doing a large scan of all the meta for the 
> table instead of random reads for each miss
>  - Each miss could fetch ahead some number of rows in META

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
