[ 
https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264470#comment-13264470
 ] 

Zhihong Yu commented on HBASE-5898:
-----------------------------------

Consider the case where off heap cache is enabled.
From DoubleBlockCache:
Suppose getBlock() is executed without the lock (the first pass in the new loop of 
readBlock) and doesn't find cacheKey in onHeapCache but finds it in 
offHeapCache. It will then call onHeapCache.cacheBlock():
{code}
  public Cacheable getBlock(BlockCacheKey cacheKey, boolean caching) {
    Cacheable cachedBlock;

    if ((cachedBlock = onHeapCache.getBlock(cacheKey, caching)) != null) {
      stats.hit(caching);
      return cachedBlock;

    } else if ((cachedBlock = offHeapCache.getBlock(cacheKey, caching)) != null) {
      if (caching) {
        onHeapCache.cacheBlock(cacheKey, cachedBlock);
      }
{code}
Another thread calls cacheBlock() around the same time and executes 
onHeapCache.cacheBlock() for the same cacheKey:
{code}
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf) {
    onHeapCache.cacheBlock(cacheKey, buf);
    offHeapCache.cacheBlock(cacheKey, buf);
  }
{code}
I think there is a race condition which didn't exist before the proposed 
change: the entries for the same cacheKey in onHeapCache and offHeapCache would 
diverge.
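
To make the interleaving concrete, here is a minimal sketch that replays one possible ordering step by step on a single thread. It is only an illustration: the class DoubleCacheRaceSketch, the two plain maps and the values "copyA"/"copyB" are hypothetical stand-ins, not HBase code, and the real caches may treat a duplicate cacheBlock() differently from a plain map put.

```java
import java.util.HashMap;
import java.util.Map;

public class DoubleCacheRaceSketch {
    // Toy stand-ins for the two cache levels (assumption: plain maps, not HBase classes).
    static final Map<String, String> onHeap = new HashMap<>();
    static final Map<String, String> offHeap = new HashMap<>();

    /** Replays one interleaving deterministically; returns true if the two levels end up different. */
    static boolean replayInterleaving() {
        onHeap.clear();
        offHeap.clear();
        String key = "block-1";
        offHeap.put(key, "copyA"); // block is present off heap only

        // Reader, getBlock() without the lock: on-heap miss, then off-heap hit.
        String seenByReader = onHeap.get(key);
        if (seenByReader == null) {
            seenByReader = offHeap.get(key);
        }

        // Writer, cacheBlock() for the same key, runs before the reader continues.
        onHeap.put(key, "copyB");
        offHeap.put(key, "copyB");

        // Reader resumes and promotes what it saw earlier into the on-heap level.
        onHeap.put(key, seenByReader);

        return !onHeap.get(key).equals(offHeap.get(key));
    }

    public static void main(String[] args) {
        System.out.println("diverged = " + replayInterleaving());
    }
}
```

In this replay the on-heap level ends with the reader's older copy while the off-heap level holds the writer's copy, which is the divergence described above.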

If the off heap cache is disabled, I don't see a problem with the proposed optimization.
                
> Consider double-checked locking for block cache lock
> ----------------------------------------------------
>
>                 Key: HBASE-5898
>                 URL: https://issues.apache.org/jira/browse/HBASE-5898
>             Project: HBase
>          Issue Type: Improvement
>          Components: performance
>    Affects Versions: 0.94.1
>            Reporter: Todd Lipcon
>         Attachments: hbase-5898.txt
>
>
> Running a workload with a high query rate against a dataset that fits in 
> cache, I saw a lot of CPU being used in IdLock.getLockEntry, being called by 
> HFileReaderV2.readBlock. Even though it was all cache hits, it was wasting a 
> lot of CPU doing lock management here. I wrote a quick patch to switch to 
> double-checked locking and it improved throughput substantially for this 
> workload.
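
For reference, the double-checked pattern described in the issue can be sketched roughly as follows, for a single cache level. This is not the attached patch: the class DoubleCheckedReadSketch, the maps and loadFromDisk() are hypothetical, and the per-key lock map only approximates what IdLock provides (it never releases entries).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class DoubleCheckedReadSketch {
    // Hypothetical single-level cache keyed by block offset.
    static final ConcurrentHashMap<Long, String> cache = new ConcurrentHashMap<>();
    // Hypothetical per-offset lock objects; never removed in this sketch.
    static final ConcurrentHashMap<Long, Object> locks = new ConcurrentHashMap<>();
    // Counts simulated disk reads so the effect of the pattern is visible.
    static final AtomicInteger loads = new AtomicInteger();

    // Stand-in for the expensive read that the lock is meant to de-duplicate.
    static String loadFromDisk(long offset) {
        loads.incrementAndGet();
        return "block@" + offset;
    }

    static String readBlock(long offset) {
        // First check: no lock taken, so cache hits skip lock management entirely.
        String block = cache.get(offset);
        if (block != null) {
            return block;
        }
        Object lock = locks.computeIfAbsent(offset, k -> new Object());
        synchronized (lock) {
            // Second check: another thread may have loaded the block while we waited.
            block = cache.get(offset);
            if (block == null) {
                block = loadFromDisk(offset);
                cache.put(offset, block);
            }
            return block;
        }
    }

    public static void main(String[] args) {
        readBlock(42L);
        readBlock(42L);
        System.out.println("loads = " + loads.get()); // prints "loads = 1"
    }
}
```

With one cache level the unlocked first check is safe because a hit has no side effects; the concern raised in the comment is that with two levels a hit in the second level also writes to the first.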
