[jira] [Commented] (CASSANDRA-10855) Use Caffeine (W-TinyLFU) for on-heap caches

Ben Manes (JIRA) Wed, 06 Jan 2016 12:39:07 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086242#comment-15086242
 ]


Ben Manes commented on CASSANDRA-10855:
---------------------------------------

Happy new year. Anything I can do to help keep this moving?

Ariel's comment explains the poor hit rate: a uniform distribution will 
result in a fixed, low hit rate regardless of policy. An effective cache 
often has a hit rate of around 85%, ideally in the high 90s to make reads the 
dominant case, but even 65% is useful. Even when the hit rate is maxed out, the 
effect of a better policy can be noticeable. In that case it reduces the TCO by 
achieving the same performance with smaller, cheaper machines.
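
To illustrate the uniform case, here is a minimal Python sketch. It is not part of the Cassandra perf test; the key space, cache size, request count, and seed are made-up values, and the two toy caches (LRU and FIFO) merely stand in for "different policies". With uniformly random keys, both should land close to capacity / key_space.

```python
import random
from collections import OrderedDict

def lru_hit_rate(keys, capacity):
    """Replay `keys` through an LRU cache and return the fraction of hits."""
    cache = OrderedDict()
    hits = 0
    for key in keys:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(keys)

def fifo_hit_rate(keys, capacity):
    """Same replay, but evict in insertion order (a hit does not reorder)."""
    cache = OrderedDict()
    hits = 0
    for key in keys:
        if key in cache:
            hits += 1
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the oldest inserted entry
    return hits / len(keys)

# Hypothetical workload parameters, chosen only for illustration.
rng = random.Random(42)
key_space, capacity, requests = 10_000, 1_000, 200_000
uniform_keys = [rng.randrange(key_space) for _ in range(requests)]

lru = lru_hit_rate(uniform_keys, capacity)
fifo = fifo_hit_rate(uniform_keys, capacity)
print(f"LRU={lru:.3f} FIFO={fifo:.3f} capacity/key_space={capacity / key_space:.3f}")
```

Because every key is equally likely, which entries the cache retains makes no difference: once the cache is full, the chance that the next request hits is simply the fraction of the key space it holds.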

Glancing at the uniform results, the degradation is small enough to probably be 
within the margin of error where the run and other system effects dominate. In 
an update-heavy workload the new cache should be faster due to synchronization 
having less penalty than CAS storms. But on the perf test's insertion-heavy 
workload it is probably a little slower due to features incurring more 
complexity. Another set of eyes might uncover some improvements, so that's 
always welcome.

[Zipf-like|http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.12.2253&rank=1]
 distributions are considered the most common workload patterns. Ideally we 
could capture a production trace and simulate it, as the [database 
trace|https://github.com/ben-manes/caffeine/wiki/Efficiency#database] I use 
shows very promising results.
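
As a rough illustration of why a skewed workload is the more interesting test, the Python sketch below replays a synthetic Zipf-like key stream through a plain LRU cache. Everything here is an assumption for illustration only: the exponent of 1.0, the key space, the cache size, and the seed are invented, and a synthetic stream is no substitute for a production trace. It does not implement W-TinyLFU; it only shows that with skew the hit rate is no longer pinned at capacity / key_space, which leaves room for the eviction policy to matter.

```python
import random
from collections import OrderedDict

def lru_hit_rate(keys, capacity):
    """Replay `keys` through an LRU cache and return the fraction of hits."""
    cache = OrderedDict()
    hits = 0
    for key in keys:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(keys)

# Hypothetical workload parameters, chosen only for illustration.
rng = random.Random(7)
key_space, capacity, requests = 10_000, 1_000, 200_000
exponent = 1.0  # assumed skew; real workloads vary

# Zipf-like popularity: the key of rank r is drawn with weight 1 / r**exponent.
weights = [1.0 / (rank ** exponent) for rank in range(1, key_space + 1)]
zipf_keys = rng.choices(range(key_space), weights=weights, k=requests)

zipf_hit = lru_hit_rate(zipf_keys, capacity)
baseline = capacity / key_space  # what a uniform stream would give
print(f"Zipf-like LRU hit rate={zipf_hit:.3f} uniform baseline={baseline:.3f}")
```

Swapping `zipf_keys` for keys parsed from a captured trace would turn this into a crude trace replay, though a proper comparison of policies needs a real simulator.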

> Use Caffeine (W-TinyLFU) for on-heap caches
> -------------------------------------------
>
>                 Key: CASSANDRA-10855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10855
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Ben Manes
>              Labels: performance
>
> Cassandra currently uses 
> [ConcurrentLinkedHashMap|https://code.google.com/p/concurrentlinkedhashmap] 
> for performance critical caches (key, counter) and Guava's cache for 
> non-critical (auth, metrics, security). All of these usages have been 
> replaced by [Caffeine|https://github.com/ben-manes/caffeine], written by the 
> author of the previously mentioned libraries.
> The primary incentive is to switch from LRU policy to W-TinyLFU, which 
> provides [near optimal|https://github.com/ben-manes/caffeine/wiki/Efficiency] 
> hit rates. It performs particularly well in database and search traces, is 
> scan resistant, and adds a very small time/space overhead to LRU.
> Secondarily, Guava's caches never obtained similar 
> [performance|https://github.com/ben-manes/caffeine/wiki/Benchmarks] to CLHM 
> due to some optimizations not being ported over. This change results in 
> faster reads and not creating garbage as a side-effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
