[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244357#comment-15244357
 ] 

Ben Manes commented on CASSANDRA-11452:
---------------------------------------

CLHM was always a decorator, but in 1.4 it embedded the CHMv8 backport. We did 
that to help improve performance for very large caches, like Cassandra's were, 
since JDK8 took a long time to be released. That's probably what you're 
remembering.

I agree that reducing per-entry overhead is attractive, though a [rough 
calculation|https://github.com/ben-manes/caffeine/wiki/Memory-overhead] 
indicates it isn't a huge savings. My view is that it is a premature 
optimization, best left to the end, after the implementation has matured, to 
re-evaluate whether the impact is worth attempting a direct rewrite. Otherwise 
it adds greatly to the complexity budget from the get-go and leaves less time 
to focus on the unique problems of the domain (API, features, efficiency). For 
example, there are more space savings from using TinyLFU over LIRS's ghost 
entries, but evaluating that took effort that I might have been too 
overwhelmed to expend. It 
would also be interesting to see if pairing with [Apache 
Mnemonic|https://github.com/apache/incubator-mnemonic] could reduce the GC 
overhead by having off-heap without the serialization penalty.

bq. Just to clarify those numbers are for small workloads?

Yep.

bq. ...it would still leave the gate open for an attacker to reduce the efficacy 
of the cache for items that have only moderate reuse likelihood.

Since the frequency is reduced by half every sample period, my assumption was 
that this attack would be very difficult. Gil's response was to instead detect 
if TinyLFU had a large number of consecutive rejections, e.g. 80 (assuming 1:20 
is admitted on average). That worked quite well, except on ARC's database trace 
(ds1), where it had a negative impact. It makes sense that scans (db, 
analytics) will have a high rejection rate. What do you think about combining 
the approaches, e.g. {{(candidateFreq <= 3) || (++unadmittedItems < 80)}}, as a 
guard prior to performing a 1% random admittance?
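
A minimal sketch of how the combined guard might look, under stated assumptions. The class {{GuardedAdmittor}}, its method names, and the injected random source are hypothetical and are not Caffeine's API. The constants 3, 80, and 1% come from the expression above. Resetting the rejection counter on every admission is an assumption, made so that the counter tracks consecutive rejections.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical admission sketch; names are illustrative, not Caffeine's API. */
final class GuardedAdmittor {
  static final int LOW_FREQUENCY = 3;            // candidateFreq <= 3
  static final int REJECTION_THRESHOLD = 80;     // ++unadmittedItems < 80
  static final double RANDOM_ADMIT_RATE = 0.01;  // 1% random admittance

  private final DoubleSupplier random;  // supplies values in [0, 1)
  private int unadmittedItems;          // consecutive rejections counted by the guard

  GuardedAdmittor(DoubleSupplier random) {
    this.random = random;
  }

  /** Returns true if the candidate should replace the eviction victim. */
  boolean admit(int candidateFreq, int victimFreq) {
    // Plain TinyLFU rule: a candidate more frequent than the victim wins.
    if (candidateFreq > victimFreq) {
      unadmittedItems = 0;  // assumption: any admission resets the streak
      return true;
    }
    // Guard from the comment. Short-circuiting means low-frequency
    // candidates are rejected without advancing the counter.
    if ((candidateFreq <= LOW_FREQUENCY) || (++unadmittedItems < REJECTION_THRESHOLD)) {
      return false;
    }
    // Past the guard: admit with a small random probability.
    boolean admitted = random.getAsDouble() < RANDOM_ADMIT_RATE;
    if (admitted) {
      unadmittedItems = 0;
    }
    return admitted;
  }
}
```

In this reading, candidates with a frequency of 3 or less never reach the random step, and a moderately frequent candidate only becomes eligible for it once 80 rejections have accumulated. Passing the random source in (for example {{ThreadLocalRandom.current()::nextDouble}}) keeps the sketch deterministic under test.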

> Cache implementation using LIRS eviction for in-process page cache
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-11452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly mark compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
