[
https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15525123#comment-15525123
]
Sean Busbey commented on HBASE-15560:
-------------------------------------
the patch failures are because the caffeine library improperly refers to the
ALv2:
{code}
<licenses>
<license>
<name>The Apache Software License, Version 2.0</name>
<url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
<distribution>repo</distribution>
</license>
</licenses>
{code}
The correct name is "Apache License, Version 2.0". You should update the
supplemental license information; there are a bunch of examples there from ASF
projects that used the wrong name for years. the file is
{{hbase-resource-bundle/src/main/resources/supplemental-models.xml}}
> TinyLFU-based BlockCache
> ------------------------
>
> Key: HBASE-15560
> URL: https://issues.apache.org/jira/browse/HBASE-15560
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache
> Affects Versions: 2.0.0
> Reporter: Ben Manes
> Assignee: Ben Manes
> Attachments: HBASE-15560.patch, HBASE-15560.patch, tinylfu.patch
>
>
> LruBlockCache uses the Segmented LRU (SLRU) policy to capture frequency and
> recency of the working set. It achieves concurrency by using an O( n )
> background thread to prioritize the entries and evict. Accessing an entry is
> O(1) by a hash table lookup, recording its logical access time, and setting a
> frequency flag. A write is performed in O(1) time by updating the hash table
> and triggering an async eviction thread. This provides ideal concurrency and
> minimizes the latencies by penalizing the thread instead of the caller.
> However the policy does not age the frequencies and may not be resilient to
> various workload patterns.
> W-TinyLFU ([research paper|http://arxiv.org/pdf/1512.00727.pdf]) records the
> frequency in a counting sketch, ages periodically by halving the counters,
> and orders entries by SLRU. An entry is discarded by comparing the frequency
> of the new arrival (candidate) to the SLRU's victim, and keeping the one with
> the highest frequency. This allows the operations to be performed in O(1)
> time and, though the use of a compact sketch, a much larger history is
> retained beyond the current working set. In a variety of real world traces
> the policy had [near optimal hit
> rates|https://github.com/ben-manes/caffeine/wiki/Efficiency].
> Concurrency is achieved by buffering and replaying the operations, similar to
> a write-ahead log. A read is recorded into a striped ring buffer and writes
> to a queue. The operations are applied in batches under a try-lock by an
> asynchronous thread, thereby track the usage pattern without incurring high
> latencies
> ([benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks#server-class]).
> In YCSB benchmarks the results were inconclusive. For a large cache (99% hit
> rates) the two caches have near identical throughput and latencies with
> LruBlockCache narrowly winning. At medium and small caches, TinyLFU had a
> 1-4% hit rate improvement and therefore lower latencies. The lack luster
> result is because a synthetic Zipfian distribution is used, which SLRU
> performs optimally. In a more varied, real-world workload we'd expect to see
> improvements by being able to make smarter predictions.
> The provided patch implements BlockCache using the
> [Caffeine|https://github.com/ben-manes/caffeine] caching library (see
> HighScalability
> [article|http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html]).
> Edward Bortnikov and Eshcar Hillel have graciously provided guidance for
> evaluating this patch ([github
> branch|https://github.com/ben-manes/hbase/tree/tinylfu]).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)