[
https://issues.apache.org/jira/browse/HBASE-6383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415548#comment-13415548
]
Jean-Daniel Cryans commented on HBASE-6383:
-------------------------------------------
I have a barebones 2Q implementation that I'm pasting here. It often performs
better than our "LRU", even when I tweaked it to use all the available memory
instead of the default max of 85%. Its main deficiency is that it can
double-cache a block (just like the DoubleBlockCache).
I also added a ghost cache to our LRU to see how it's performing, and it seems
to do worse most of the time, but only by a little (e.g. less than 0.010%).
{code}
// Assumed context (not shown in the original paste): CacheWeigher and
// DummyWeigher are Weigher implementations defined elsewhere, and the usual
// imports are in scope:
//   java.io.IOException, java.util.List, java.util.concurrent.ConcurrentMap,
//   org.apache.hadoop.conf.Configuration,
//   com.google.common.cache.{CacheBuilder, RemovalListener, RemovalNotification}

/**
 * This 2Q implementation uses 3 CacheBuilder-generated maps to hold am, aIn
 * and aOut.
 * It uses the default sizes from the paper: 75% for am, 25% for aIn and 50%
 * for the ghost cache aOut.
 * am and aIn use an eviction listener in order to move their block keys to
 * aOut.
 */
static class TwoQueueCache implements BlockCache {

  final CacheStats cacheStats = new CacheStats();
  final ConcurrentMap<BlockCacheKey, Cacheable> am;
  final ConcurrentMap<BlockCacheKey, Cacheable> aIn;
  final ConcurrentMap<BlockCacheKey, BlockCacheKey> aOut;

  TwoQueueCache(long size) {
    am = CacheBuilder.newBuilder()
        .removalListener(new EvictionListener())
        .maximumWeight((long) (size * 0.75))
        .weigher(new CacheWeigher())
        .<BlockCacheKey, Cacheable>build().asMap();
    aIn = CacheBuilder.newBuilder()
        .removalListener(new EvictionListener())
        .maximumWeight((long) (size * 0.25))
        .weigher(new CacheWeigher())
        .<BlockCacheKey, Cacheable>build().asMap();
    // aOut only stores keys, so it's weighed by entry count, not block size.
    aOut = CacheBuilder.newBuilder()
        .maximumWeight((long) (size * 0.25))
        .weigher(new DummyWeigher())
        .<BlockCacheKey, BlockCacheKey>build().asMap();
  }

  @Override
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
    // A block seen before in aOut or aIn, or flagged in-memory, goes to am.
    Object aOutKey = aOut.remove(cacheKey);
    Object aInKey = aIn.remove(cacheKey);
    if (aOutKey != null || aInKey != null || inMemory) {
      am.put(cacheKey, buf);
    } else {
      aIn.put(cacheKey, buf);
    }
  }

  @Override
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf) {
    this.cacheBlock(cacheKey, buf, false);
  }

  @Override
  public Cacheable getBlock(BlockCacheKey cacheKey, boolean caching) {
    Cacheable block = am.get(cacheKey);
    if (block == null) {
      // A hit in aIn promotes the block to am.
      block = aIn.remove(cacheKey);
      if (block == null) {
        cacheStats.miss(caching);
      } else {
        cacheStats.hit(caching);
        am.put(cacheKey, block);
      }
    } else {
      cacheStats.hit(caching);
    }
    return block;
  }

  @Override
  public boolean evictBlock(BlockCacheKey cacheKey) { return false; }

  @Override
  public int evictBlocksByHfileName(String hfileName) { return 0; }

  @Override
  public CacheStats getStats() { return cacheStats; }

  @Override
  public void shutdown() { }

  @Override
  public long size() { return 0; }

  @Override
  public long getFreeSize() { return 0; }

  @Override
  public long getCurrentSize() { return 0; }

  @Override
  public long getEvictedCount() { return 0; }

  @Override
  public long getBlockCount() { return 0; }

  @Override
  public List<BlockCacheColumnFamilySummary> getBlockCacheColumnFamilySummaries(
      Configuration conf) throws IOException {
    return null;
  }

  /** Moves keys evicted from am or aIn into the ghost cache aOut. */
  class EvictionListener implements RemovalListener<BlockCacheKey, Cacheable> {
    @Override
    public void onRemoval(RemovalNotification<BlockCacheKey, Cacheable> notification) {
      // Explicit removals (e.g. promotion out of aIn) are not ghosted.
      if (!notification.wasEvicted()) {
        return;
      }
      aOut.put(notification.getKey(), notification.getKey());
    }
  }
}
{code}
> Investigate using 2Q for block cache
> ------------------------------------
>
> Key: HBASE-6383
> URL: https://issues.apache.org/jira/browse/HBASE-6383
> Project: HBase
> Issue Type: New Feature
> Components: performance, regionserver
> Affects Versions: 0.96.0
> Reporter: Jesse Yates
> Priority: Minor
>
> Currently we use a basic version of LRU to handle block caching. LRU is known
> to be very susceptible to scan thrashing (it is not scan resistant), and scans
> are a common operation in HBase. 2Q is an efficient caching algorithm that
> emulates the effectiveness of LRU/2 (eviction based not on the last access,
> but rather the access before the last), but is O(1), rather than O(lg n), in
> complexity.
> JD has long been talking about investigating 2Q as it may be far better for
> HBase than LRU and has been shown to be incredibly useful for traditional
> database caching on production systems.
> One would need to implement 2Q (though the pseudocode in the paper is quite
> explicit) and then test against the existing cache implementation.
> The link to the original paper is here: www.vldb.org/conf/1994/P439.PDF
> A short overview of 2Q:
> 2Q uses two queues (hence the name) and a list of pointers to keep track of
> cached blocks. The first queue is for new, hot items (Ain). If an item is
> accessed that isn't in Ain, the coldest block is evicted from Ain and the new
> item replaces it. Anything accessed in Ain is already stored in memory and
> kept in Ain.
> When a block is evicted from Ain, it is moved to Aout _as a pointer_. If Aout
> is full, the oldest element is evicted and replaced with the new pointer.
> The key to 2Q is that when you access something in Aout, it is reloaded into
> memory and stored in the main queue, Am. If Am becomes full, then the coldest
> block is evicted.
> This essentially makes Aout a filter for long-term hot items, based on the
> size of Aout. The original authors found that while you can tune Aout, it
> generally performs very well at "50% of the number of pages as would fit
> into the buffer", but it can be tuned as low as 5% at only a slight cost in
> responsiveness to changes.
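The Ain/Aout/Am flow described in the overview can be sketched with plain
LinkedHashMaps. This is a toy, single-threaded illustration, not the HBase
BlockCache API; the class name TwoQueueSketch and the tiny capacities are made
up for the example:
{code}
import java.util.LinkedHashMap;

/** Toy 2Q sketch: aIn is a FIFO, aOut remembers evicted keys, am is an LRU. */
class TwoQueueSketch<K, V> {
  static final int AIN_MAX = 2, AOUT_MAX = 2, AM_MAX = 2; // toy capacities

  private final LinkedHashMap<K, V> aIn = new LinkedHashMap<>();            // insertion order = FIFO
  private final LinkedHashMap<K, Boolean> aOut = new LinkedHashMap<>();     // ghost: keys only
  private final LinkedHashMap<K, V> am = new LinkedHashMap<>(16, 0.75f, true); // access order = LRU

  V get(K key) {
    if (am.containsKey(key)) return am.get(key); // get() refreshes LRU order
    return aIn.get(key);                         // hits in aIn stay in aIn
  }

  void put(K key, V value) {
    if (am.containsKey(key) || aIn.containsKey(key)) return; // already cached
    if (aOut.containsKey(key)) {   // seen recently: treat as long-term hot
      aOut.remove(key);
      am.put(key, value);
      trim(am, AM_MAX, null);      // am evictions are simply dropped
    } else {                       // brand new: admit to the FIFO
      aIn.put(key, value);
      trim(aIn, AIN_MAX, aOut);    // aIn evictions leave a pointer in aOut
    }
  }

  // Evict the oldest entry when over capacity, optionally ghosting its key.
  private void trim(LinkedHashMap<K, V> map, int max, LinkedHashMap<K, Boolean> ghost) {
    if (map.size() <= max) return;
    K oldest = map.keySet().iterator().next();
    map.remove(oldest);
    if (ghost != null) {
      ghost.put(oldest, Boolean.TRUE);
      if (ghost.size() > AOUT_MAX) ghost.remove(ghost.keySet().iterator().next());
    }
  }

  boolean inAm(K key)  { return am.containsKey(key); }
  boolean inAIn(K key) { return aIn.containsKey(key); }
}
{code}
Note that, following the paper's description rather than the implementation
pasted above, a hit in aIn keeps the block in aIn; only re-accessing a key
remembered in aOut promotes it to am.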