[
https://issues.apache.org/jira/browse/HBASE-6383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415548#comment-13415548
]
Jean-Daniel Cryans commented on HBASE-6383:
-------------------------------------------
I have a barebones 2Q implementation that I'm pasting here. It often performs
better than our "LRU", even when I tweaked it to use all the available memory
instead of the default max of 85%. Its main deficiency is that it can
double-cache a block (just like the DoubleBlockCache).
I also added a ghost cache to our LRU to see how it's performing, and it seems
to do worse most of the time, but only by a little (e.g. less than 0.010%).
{code}
// Assumed context (not shown in the original paste): CacheWeigher and
// DummyWeigher are Weigher implementations defined elsewhere, and the usual
// imports are in scope:
//   java.io.IOException, java.util.List, java.util.concurrent.ConcurrentMap,
//   org.apache.hadoop.conf.Configuration,
//   com.google.common.cache.{CacheBuilder, RemovalListener, RemovalNotification}

/**
 * This 2Q implementation uses 3 CacheBuilder-generated maps to hold am, aIn
 * and aOut.
 * It uses the default sizes from the paper: 75% for am, 25% for aIn and 50%
 * for the ghost cache aOut.
 * am and aIn use an eviction listener in order to move their block keys to
 * aOut.
 */
static class TwoQueueCache implements BlockCache {

  final CacheStats cacheStats = new CacheStats();
  final ConcurrentMap<BlockCacheKey, Cacheable> am;
  final ConcurrentMap<BlockCacheKey, Cacheable> aIn;
  final ConcurrentMap<BlockCacheKey, BlockCacheKey> aOut;

  TwoQueueCache(long size) {
    am = CacheBuilder.newBuilder()
        .removalListener(new EvictionListener())
        .maximumWeight((long) (size * 0.75))
        .weigher(new CacheWeigher())
        .<BlockCacheKey, Cacheable>build().asMap();
    aIn = CacheBuilder.newBuilder()
        .removalListener(new EvictionListener())
        .maximumWeight((long) (size * 0.25))
        .weigher(new CacheWeigher())
        .<BlockCacheKey, Cacheable>build().asMap();
    // aOut only stores keys, so it's weighed by entry count, not block size.
    aOut = CacheBuilder.newBuilder()
        .maximumWeight((long) (size * 0.25))
        .weigher(new DummyWeigher())
        .<BlockCacheKey, BlockCacheKey>build().asMap();
  }

  @Override
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
    // A block seen before in aOut or aIn, or flagged in-memory, goes to am.
    Object aOutKey = aOut.remove(cacheKey);
    Object aInKey = aIn.remove(cacheKey);
    if (aOutKey != null || aInKey != null || inMemory) {
      am.put(cacheKey, buf);
    } else {
      aIn.put(cacheKey, buf);
    }
  }

  @Override
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf) {
    this.cacheBlock(cacheKey, buf, false);
  }

  @Override
  public Cacheable getBlock(BlockCacheKey cacheKey, boolean caching) {
    Cacheable block = am.get(cacheKey);
    if (block == null) {
      // A hit in aIn promotes the block to am.
      block = aIn.remove(cacheKey);
      if (block == null) {
        cacheStats.miss(caching);
      } else {
        cacheStats.hit(caching);
        am.put(cacheKey, block);
      }
    } else {
      cacheStats.hit(caching);
    }
    return block;
  }

  @Override
  public boolean evictBlock(BlockCacheKey cacheKey) { return false; }

  @Override
  public int evictBlocksByHfileName(String hfileName) { return 0; }

  @Override
  public CacheStats getStats() { return cacheStats; }

  @Override
  public void shutdown() { }

  @Override
  public long size() { return 0; }

  @Override
  public long getFreeSize() { return 0; }

  @Override
  public long getCurrentSize() { return 0; }

  @Override
  public long getEvictedCount() { return 0; }

  @Override
  public long getBlockCount() { return 0; }

  @Override
  public List<BlockCacheColumnFamilySummary> getBlockCacheColumnFamilySummaries(
      Configuration conf) throws IOException {
    return null;
  }

  /** Moves keys evicted from am or aIn into the ghost cache aOut. */
  class EvictionListener implements RemovalListener<BlockCacheKey, Cacheable> {
    @Override
    public void onRemoval(RemovalNotification<BlockCacheKey, Cacheable> notification) {
      // Explicit removals (e.g. promotion out of aIn) are not ghosted.
      if (!notification.wasEvicted()) {
        return;
      }
      aOut.put(notification.getKey(), notification.getKey());
    }
  }
}
{code}
> Investigate using 2Q for block cache
> ------------------------------------
>
> Key: HBASE-6383
> URL: https://issues.apache.org/jira/browse/HBASE-6383
> Project: HBase
> Issue Type: New Feature
> Components: performance, regionserver
> Affects Versions: 0.96.0
> Reporter: Jesse Yates
> Priority: Minor
>
> Currently we use a basic version of LRU to handle block caching. LRU is known
> to be very susceptible to scan thrashing (it is not scan resistant), and scans
> are a common operation in HBase. 2Q is an efficient caching algorithm that
> emulates the effectiveness of LRU/2 (eviction based not on the last access,
> but rather the access before the last), but is O(1), rather than O(lg n), in
> complexity.
> JD has long been talking about investigating 2Q as it may be far better for
> HBase than LRU and has been shown to be incredibly useful for traditional
> database caching on production systems.
> One would need to implement 2Q (though the pseudocode in the paper is quite
> explicit) and then test against the existing cache implementation.
> The link to the original paper is here: www.vldb.org/conf/1994/P439.PDF
> A short overview of 2Q:
> 2Q uses two queues (hence the name) and a list of pointers to keep track of
> cached blocks. The first queue is for new, hot items (Ain). If an item is
> accessed that isn't in Ain, the coldest block is evicted from Ain and the new
> item replaces it. Anything accessed in Ain is already stored in memory and
> kept in Ain.
> When a block is evicted from Ain, it is moved to Aout _as a pointer_. If Aout
> is full, the oldest element is evicted and replaced with the new pointer.
> The key to 2Q is that when you access something in Aout, it is reloaded into
> memory and stored in the main queue, Am. If Am becomes full, then the coldest
> block is evicted.
> This essentially makes Aout a filter for long-term hot items, based on the
> size of Aout. The original authors found that while you can tune Aout, it
> generally performs very well at "50% of the number of pages as would fit
> into the buffer", but it can be tuned as low as 5% at only a slight cost in
> responsiveness to changes.
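The Ain/Aout/Am flow described in the overview can be sketched with plain
LinkedHashMaps. This is a toy, single-threaded illustration, not the HBase
BlockCache API; the class name TwoQueueSketch and the tiny capacities are made
up for the example:
{code}
import java.util.LinkedHashMap;

/** Toy 2Q sketch: aIn is a FIFO, aOut remembers evicted keys, am is an LRU. */
class TwoQueueSketch<K, V> {
  static final int AIN_MAX = 2, AOUT_MAX = 2, AM_MAX = 2; // toy capacities

  private final LinkedHashMap<K, V> aIn = new LinkedHashMap<>();            // insertion order = FIFO
  private final LinkedHashMap<K, Boolean> aOut = new LinkedHashMap<>();     // ghost: keys only
  private final LinkedHashMap<K, V> am = new LinkedHashMap<>(16, 0.75f, true); // access order = LRU

  V get(K key) {
    if (am.containsKey(key)) return am.get(key); // get() refreshes LRU order
    return aIn.get(key);                         // hits in aIn stay in aIn
  }

  void put(K key, V value) {
    if (am.containsKey(key) || aIn.containsKey(key)) return; // already cached
    if (aOut.containsKey(key)) {   // seen recently: treat as long-term hot
      aOut.remove(key);
      am.put(key, value);
      trim(am, AM_MAX, null);      // am evictions are simply dropped
    } else {                       // brand new: admit to the FIFO
      aIn.put(key, value);
      trim(aIn, AIN_MAX, aOut);    // aIn evictions leave a pointer in aOut
    }
  }

  // Evict the oldest entry when over capacity, optionally ghosting its key.
  private void trim(LinkedHashMap<K, V> map, int max, LinkedHashMap<K, Boolean> ghost) {
    if (map.size() <= max) return;
    K oldest = map.keySet().iterator().next();
    map.remove(oldest);
    if (ghost != null) {
      ghost.put(oldest, Boolean.TRUE);
      if (ghost.size() > AOUT_MAX) ghost.remove(ghost.keySet().iterator().next());
    }
  }

  boolean inAm(K key)  { return am.containsKey(key); }
  boolean inAIn(K key) { return aIn.containsKey(key); }
}
{code}
Note that, following the paper's description rather than the implementation
pasted above, a hit in aIn keeps the block in aIn; only re-accessing a key
remembered in aOut promotes it to am.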