[ 
https://issues.apache.org/jira/browse/HADOOP-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552529
 ] 

Bryan Duxbury commented on HADOOP-2443:
---------------------------------------

So, this issue indicates that clients keep a dense list of regions and 
locations. Is there a reason why they should keep a dense list instead of a 
sparse one? In production clusters, there could be hundreds or thousands of 
regions, and caching them all seems to be pretty inefficient. 

Instead, couldn't we just cache the regions we've found through trying to use 
them? That is, when you're going to do some operation on a key, you first check 
the cache to see if you already know where that key lives. On a hit, you go 
straight to the region server. On a miss, you ask the master/META regionserver 
to resolve where your key belongs, and just remember the location and bounds of 
that region. If the server hasn't got that region anymore, you get an NSRE 
(NotServingRegionException) and invalidate just the single cache item you used 
to get there. Isn't this how Bigtable does it?
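
To make the idea concrete, here is a rough sketch of such a sparse cache. It is 
only an illustration, not the actual HBase client code: LazyRegionCache, 
RegionLocation, and the metaLookup callback are hypothetical names, keys are 
plain strings, and an empty end key is assumed to mean "to the end of the 
table".

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Hypothetical sparse client-side region cache; not part of HBase. */
class LazyRegionCache {

    /** One region: keys in [startKey, endKey), served by `server`. */
    static final class RegionLocation {
        final String startKey;
        final String endKey;   // assumption: "" means "no upper bound"
        final String server;

        RegionLocation(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }

        boolean contains(String key) {
            return key.compareTo(startKey) >= 0
                && (endKey.isEmpty() || key.compareTo(endKey) < 0);
        }
    }

    // Only regions we have actually used, indexed by start key.
    private final TreeMap<String, RegionLocation> byStartKey = new TreeMap<>();

    // Stand-in for asking the master/META regionserver where a key lives.
    private final Function<String, RegionLocation> metaLookup;

    LazyRegionCache(Function<String, RegionLocation> metaLookup) {
        this.metaLookup = metaLookup;
    }

    /** Return the cached region for `key`, resolving it on a miss. */
    RegionLocation locate(String key) {
        Map.Entry<String, RegionLocation> e = byStartKey.floorEntry(key);
        if (e != null && e.getValue().contains(key)) {
            return e.getValue();                  // cache hit
        }
        RegionLocation loc = metaLookup.apply(key);
        byStartKey.put(loc.startKey, loc);        // remember location and bounds
        return loc;
    }

    /** On NSRE, drop only the entry that covered `key`. */
    void invalidate(String key) {
        Map.Entry<String, RegionLocation> e = byStartKey.floorEntry(key);
        if (e != null && e.getValue().contains(key)) {
            byStartKey.remove(e.getKey());
        }
    }

    int size() {
        return byStartKey.size();
    }
}
```

A caller would wrap each operation in a small retry loop: locate(key), talk to 
the returned server, and on NSRE call invalidate(key) and locate(key) again. 
Entries for every other region stay untouched.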

> [hbase] Keep lazy cache of regions in client rather than an 'authoritative' 
> list
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-2443
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2443
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>
> Currently, when the client gets a NotServingRegionException -- usually 
> because the region is in the middle of being split, or because there has been 
> a regionserver crash and the region is being moved elsewhere -- the client 
> does a complete refresh of its cache of region locations for a table.
> Chatting with Jim about a Paul Saab upload issue from Saturday night: when 
> tables are big and comprised of regions that are splitting fast (because of 
> bulk upload), it's unlikely a client will ever be able to obtain a stable list 
> of all region locations.  Given that any update or scan requires that the list 
> of all regions be in place before it proceeds, this can get in the way of the 
> client succeeding when the cluster is under load.
> Chatting, we figure it is better that the client holds a lazy region cache: 
> on NSRE, figure out only where that region has gone and update the 
> client-side cache for that entry only, rather than throw out all we know of a 
> table every time.
> Hopefully this will fix the issue PS was experiencing where during intense 
> upload, he was unable to get/scan/hql the same table.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
