[
https://issues.apache.org/jira/browse/HBASE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870886#action_12870886
]
Mingjie Lai commented on HBASE-2468:
------------------------------------
Thanks for your responses. Based on the discussion, especially the suggestions
from Todd and Andrew, here is my new proposal to address this issue.
0) (The biggest objection was to prewarming the whole cache at table
initialization, so that will no longer be supported.)
1) Give clients the option to warm the cache for *each table*. New interfaces
are provided to enable/disable cache prefetch in case of a cache miss.
{code}
public interface HConnection {
  // ...
  /**
   * Enable or disable region cache prewarm for the <i>tableName</i>.
   */
  public void setRegionCachePreWarm(final byte[] tableName, boolean enable);
  public boolean getRegionCachePreWarm(final byte[] tableName);
}

public class HTable implements HTableInterface {
  // ...
  private boolean isRegionCachePreWarmEnabled = true; // enabled by default

  public void setRegionCachePreWarm(boolean enable) {
    this.isRegionCachePreWarmEnabled = enable;
    this.connection.setRegionCachePreWarm(this.tableName,
        this.isRegionCachePreWarmEnabled);
  }

  public boolean getRegionCachePreWarm() {
    return this.connection.getRegionCachePreWarm(this.tableName);
  }
}

// Usage: turn prewarm off for table "foo" only.
HTable t1 = new HTable("foo");
t1.setRegionCachePreWarm(false);
{code}
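For illustration only, here is a rough sketch of how the connection side could keep the per-table flag. The class name PreWarmFlags and the use of ByteBuffer keys are assumptions made for this sketch, not part of the patch; the only point is that byte[] table names need content-based equality, and that a table without an explicit setting is treated as enabled.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the connection-side bookkeeping; not HBase code. */
final class PreWarmFlags {
    // byte[] uses identity-based equals/hashCode, so each table name is
    // wrapped in a ByteBuffer, which compares and hashes by content.
    private final Map<ByteBuffer, Boolean> flags = new ConcurrentHashMap<>();

    void setRegionCachePreWarm(byte[] tableName, boolean enable) {
        // Copy the name so later changes to the caller's array cannot corrupt the key.
        flags.put(ByteBuffer.wrap(tableName.clone()), enable);
    }

    /** Tables with no explicit setting are treated as enabled. */
    boolean getRegionCachePreWarm(byte[] tableName) {
        return flags.getOrDefault(ByteBuffer.wrap(tableName), Boolean.TRUE);
    }
}
```

With this shape, disabling prewarm for "foo" leaves every other table on the default.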
2) Prefetch a certain number of region locations on a cache miss when performing
a location lookup. (This piece was implemented in the previous patch, except
for the configurable read-ahead number.)
{code}
// HConnectionManager.TableServer
this.preFetchRegionLimit = conf.getInt("hbase.client.prefetch.limit", 10);
{code}
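To make the read-ahead behaviour concrete, here is a minimal sketch under simplifying assumptions: META is simulated by an in-memory sorted map, row keys are Strings, and the names PrefetchingRegionCache, Region and metaScans are invented for the example. On a miss it does one scan starting at the region that contains the row and caches up to the configured number of consecutive regions.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical client-side location cache; META is simulated by a sorted map. */
final class PrefetchingRegionCache {
    static final class Region {
        final String startKey;
        final String endKey;   // "" means "up to the end of the table"
        final String server;

        Region(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }

        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    final NavigableMap<String, Region> meta;               // start key -> region
    final NavigableMap<String, Region> cache = new TreeMap<>();
    final int prefetchLimit;
    int metaScans = 0;                                     // number of simulated META reads

    PrefetchingRegionCache(NavigableMap<String, Region> meta, int prefetchLimit) {
        this.meta = meta;
        this.prefetchLimit = Math.max(1, prefetchLimit);   // always fetch the needed region
    }

    /** Returns the region containing row, or null if META has no such region. */
    Region locate(String row) {
        Map.Entry<String, Region> cached = cache.floorEntry(row);
        if (cached != null && cached.getValue().contains(row)) {
            return cached.getValue();                      // cache hit, no META read
        }
        metaScans++;                                       // cache miss: one scan of META
        String from = meta.floorKey(row);
        if (from == null) {
            return null;
        }
        int fetched = 0;
        for (Region r : meta.tailMap(from, true).values()) {
            if (fetched++ >= prefetchLimit) {
                break;
            }
            cache.put(r.startKey, r);                      // read ahead into the cache
        }
        Region found = cache.get(from);
        return found.contains(row) ? found : null;
    }
}
```

A larger limit trades a bigger scan per miss for fewer misses on sequential access; a limit of 1 degenerates to the plain one-lookup-per-miss behaviour.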
3) Serialize META to disk, put it in the DistributedCache, and then prewarm the
meta cache from there.
{code}
// getRegionsInfo does not update the region location cache for the table
// we don't want to change that
Map<HRegionInfo,HServerAddress> regionMap = table.getRegionsInfo();
// added: serialize region info so it can be put in the DC
table.serializeRegionInfo(regionMap);
{code}
{code}
// added: deserialize region info from DC
regionMap = table.deserializeRegionInfo();
// added: regionMap will be set to HConnectionManager region cache.
table.setRegionsInfo(regionMap);
{code}
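As a rough idea of the serialize/deserialize round trip, here is a self-contained sketch that uses plain Strings in place of HRegionInfo and HServerAddress. The class name RegionMapCodec and the wire format (an entry count followed by key/value pairs) are assumptions for illustration; the real methods would use whatever serialization those HBase types already provide.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical codec: region start key -> server address, both as Strings. */
final class RegionMapCodec {
    /** Encodes the map as an entry count followed by (key, value) pairs. */
    static byte[] serialize(Map<String, String> regionMap) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(regionMap.size());
            for (Map.Entry<String, String> e : regionMap.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds the map from bytes produced by serialize. */
    static Map<String, String> deserialize(byte[] data) {
        Map<String, String> regionMap = new TreeMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int n = in.readInt();
            for (int i = 0; i < n; i++) {
                String startKey = in.readUTF();
                regionMap.put(startKey, in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return regionMap;
    }
}
```

The job client would write the bytes to the file that goes into the DistributedCache, and each task would read that file back and hand the resulting map to the region cache.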
Please review and share your feedback. If there are no objections, I
will provide a new patch based on this proposal.
> Improvements to prewarm META cache on clients
> ---------------------------------------------
>
> Key: HBASE-2468
> URL: https://issues.apache.org/jira/browse/HBASE-2468
> Project: HBase
> Issue Type: Improvement
> Components: client
> Reporter: Todd Lipcon
> Assignee: Mingjie Lai
> Fix For: 0.21.0
>
> Attachments: HBASE-2468-trunk.patch
>
>
> A couple different use cases cause storms of reads to META during startup.
> For example, a large MR job will cause each map task to hit meta since it
> starts with an empty cache.
> A couple possible improvements have been proposed:
> - MR jobs could ship a copy of META for the table in the DistributedCache
> - Clients could prewarm cache by doing a large scan of all the meta for the
> table instead of random reads for each miss
> - Each miss could fetch ahead some number of rows in META