ben-manes commented on a change in pull request #3215:
URL: https://github.com/apache/hbase/pull/3215#discussion_r638239220



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
##########
@@ -158,7 +158,13 @@ public boolean containsBlock(BlockCacheKey cacheKey) {
   @Override
   public Cacheable getBlock(BlockCacheKey cacheKey,
       boolean caching, boolean repeat, boolean updateCacheMetrics) {
-    Cacheable value = cache.getIfPresent(cacheKey);
+    Cacheable value = cache.asMap().computeIfPresent(cacheKey, (blockCacheKey, cacheable) -> {
+      // It will be referenced by RPC path, so increase here. NOTICE: Must do the retain inside
+      // this block. because if retain outside the map#computeIfPresent, the evictBlock may remove
+      // the block and release, then we're retaining a block with refCnt=0 which is disallowed.
+      cacheable.retain();
+      return cacheable;
+    });

Review comment:
       Thanks @saintstack. This was from my analysis when contributing the 
original patch.
   
   A Zipfian distribution is wonderful for a perf benchmark because it stresses locks, etc. to find bottlenecks, but it isn't representative of actual production performance. I'm not sure there is a great approach other than network record/replay or canarying.
   
   If you have a workload trace, we can try to simulate that, where the hit rates should be better (e.g. [database trace](https://github.com/ben-manes/caffeine/wiki/Efficiency#database)). That wouldn't show actual system behavior, just the cache's expected hit rates in isolation. HBase's LRU is similar-ish to SLRU, so ARC might be a good upper bound of expectations.
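
   To make "hit rates in isolation" concrete, here is a minimal sketch of a trace replay. It assumes a trace is just an ordered sequence of keys and uses a plain access-ordered `LinkedHashMap` as a stand-in LRU; it is not HBase's `LruBlockCache`, not Caffeine's simulator, and the `TraceReplay` and `hitRate` names are hypothetical. A real comparison would swap in the policies under test and feed in a recorded workload.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: replays a key trace against a fixed-size LRU and reports the hit rate. */
public final class TraceReplay {

  /** Stand-in LRU: an access-ordered LinkedHashMap that evicts the eldest entry when full. */
  static final class LruCache<K> extends LinkedHashMap<K, Boolean> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruCache(int capacity) {
      super(16, 0.75f, true); // accessOrder = true, so get() refreshes recency
      this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, Boolean> eldest) {
      return size() > capacity;
    }
  }

  /** Returns hits / total accesses for the given trace, or 0.0 for an empty trace. */
  static <K> double hitRate(Iterable<K> trace, int capacity) {
    LruCache<K> cache = new LruCache<>(capacity);
    long hits = 0;
    long total = 0;
    for (K key : trace) {
      total++;
      if (cache.get(key) != null) {
        hits++;
      } else {
        cache.put(key, Boolean.TRUE); // a miss loads the entry, possibly evicting the eldest
      }
    }
    return total == 0 ? 0.0 : (double) hits / total;
  }

  public static void main(String[] args) {
    // Synthetic trace: a small hot set (1, 2) interleaved with one-off keys.
    List<Integer> trace = List.of(1, 2, 3, 1, 2, 4, 1, 2, 5, 1, 2, 6);
    System.out.println("capacity=2 hitRate=" + hitRate(trace, 2));
    System.out.println("capacity=3 hitRate=" + hitRate(trace, 3));
  }
}
```

   Running the same trace through each candidate policy at the same capacity gives comparable hit rates, which says nothing about latency or lock contention in the real system.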
   
   Between the Zipfian benchmark and trace simulations, we can get a rough idea of whether there is a benefit. Otherwise canarying is the best that I've seen so far, which is a bit heavy-handed but simple.





