[ 
https://issues.apache.org/jira/browse/CASSANDRA-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117212#comment-15117212
 ] 

Benedict commented on CASSANDRA-10855:
--------------------------------------

Zipf could easily be added.  I added Weibull as the extreme value distribution 
without thinking too much about it at the time, and I still don't really 
understand probability distributions (nor have the time to).  Both are used to 
model incidence and sizes, but in differing circumstances.  Zipf originates 
from language, but the Weibull distribution apparently better models 
iconographic languages!

It does seem that the literature standardizes on Zipf for incidence in things 
like access patterns.  I'm pretty sure Weibull would also be fine, though it 
is perhaps better suited to generating extreme values (such as payload size).  
Even that could happily be handled by Zipf, so perhaps we should just switch 
out Weibull entirely.
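
For anyone wanting to compare the two, here is a rough sketch of how each 
could be sampled.  It is not the stress tool's code: the class names 
ZipfSampler and WeibullSampler and all parameters are made up for 
illustration.  Zipf is drawn over a finite set of ranks via a precomputed 
CDF, and Weibull via the standard inverse transform.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper: samples ranks 1..n with P(k) proportional to 1 / k^exponent. */
final class ZipfSampler {
    private final double[] cdf;   // cdf[i] = P(rank <= i + 1)
    private final Random random;

    ZipfSampler(int n, double exponent, long seed) {
        cdf = new double[n];
        double sum = 0;
        for (int k = 1; k <= n; k++) {
            sum += 1.0 / Math.pow(k, exponent);
            cdf[k - 1] = sum;
        }
        // Normalize so the last entry is exactly 1.0.
        for (int i = 0; i < n; i++) {
            cdf[i] /= sum;
        }
        random = new Random(seed);
    }

    /** Returns a rank in [1, n]; low ranks are the "hot" keys. */
    int next() {
        double u = random.nextDouble();
        int idx = Arrays.binarySearch(cdf, u);
        if (idx < 0) {
            idx = -idx - 1;       // insertion point: first entry greater than u
        }
        return Math.min(idx, cdf.length - 1) + 1;
    }
}

/** Hypothetical helper: Weibull sample via inverse transform of the CDF. */
final class WeibullSampler {
    static double next(Random random, double shape, double scale) {
        double u = random.nextDouble();                       // u in [0, 1)
        return scale * Math.pow(-Math.log(1.0 - u), 1.0 / shape);
    }
}
```

With an exponent around 1, a small fraction of the ranks receives most of 
the draws, which is the skew usually assumed for access patterns.  The 
Weibull draw is a continuous non-negative value, so it maps more naturally 
onto something like a payload size.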


> Use Caffeine (W-TinyLFU) for on-heap caches
> -------------------------------------------
>
>                 Key: CASSANDRA-10855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10855
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Ben Manes
>              Labels: performance
>
> Cassandra currently uses 
> [ConcurrentLinkedHashMap|https://code.google.com/p/concurrentlinkedhashmap] 
> for performance critical caches (key, counter) and Guava's cache for 
> non-critical (auth, metrics, security). All of these usages have been 
> replaced by [Caffeine|https://github.com/ben-manes/caffeine], written by the 
> author of the previously mentioned libraries.
> The primary incentive is to switch from LRU policy to W-TinyLFU, which 
> provides [near optimal|https://github.com/ben-manes/caffeine/wiki/Efficiency] 
> hit rates. It performs particularly well in database and search traces, is 
> scan resistant, and adds a very small time/space overhead to LRU.
> Secondarily, Guava's caches never obtained similar 
> [performance|https://github.com/ben-manes/caffeine/wiki/Benchmarks] to CLHM 
> due to some optimizations not being ported over. This change results in 
> faster reads and not creating garbage as a side-effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
