[
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244357#comment-15244357
]
Ben Manes commented on CASSANDRA-11452:
---------------------------------------
CLHM was always a decorator, but in 1.4 it embedded the CHMv8 backport. We did
that to help improve performance for very large caches, like Cassandra's were,
since JDK8 took a long time to arrive. That's probably what you're remembering.
I agree that reducing per-entry overhead is attractive, though a [rough
calculation|https://github.com/ben-manes/caffeine/wiki/Memory-overhead]
indicates it isn't a huge savings. My view is that it is a premature
optimization and best left to the end after the implementation has matured, to
re-evaluate if the impact is worth attempting a direct rewrite. Otherwise it
adds greatly to the complexity budget from the get-go and leaves less time to
focus on the unique problems of the domain (API, features, efficiency). For
example, there is more space savings from using TinyLFU over LIRS's ghost
entries, but evaluating that took effort that I might have been too overwhelmed
to expend. It
would also be interesting to see if pairing with [Apache
Mnemonic|https://github.com/apache/incubator-mnemonic] could reduce the GC
overhead by having off-heap without the serialization penalty.
bq. Just to clarify those numbers are for small workloads?
Yep.
bq. ...it would still leave the gate open for an attacker to reduce the efficacy
of the cache for items that have only moderate reuse likelihood.
Since the frequency is reduced by half every sample period, my assumption was
that this attack would be very difficult. Gil's response was to instead detect
if TinyLFU had a large number of consecutive rejections, e.g. 80 (assuming 1:20
is admitted on average). That worked quite well, except on ARC's database trace
(ds1), where it had a negative impact. It makes sense that scans (db, analytics)
will have a high rejection rate. What do you think about combining the
approaches, e.g. {{(candidateFreq <= 3) || (++unadmittedItems < 80)}}, as a guard
prior to performing a 1% random admittance?
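A minimal sketch of how that combined guard might look, to make the question
concrete. It is not Caffeine's or Cassandra's actual code. The class name
{{GuardedAdmittor}}, the constants, and the accessor are hypothetical. Two
things are assumptions on my part: that the guard being true means "reject
without rolling the dice", and that any admission resets the rejection streak.

```java
import java.util.Random;

/** Hypothetical TinyLFU-style admission check with the combined guard. */
final class GuardedAdmittor {
  static final int WARM_THRESHOLD = 3;        // candidateFreq <= 3 is treated as cold
  static final int REJECTION_LIMIT = 80;      // consecutive rejections before jitter applies
  static final int RANDOM_ADMIT_ONE_IN = 100; // 1% random admittance

  private final Random random;
  private int unadmittedItems;

  GuardedAdmittor(Random random) {
    this.random = random;
  }

  /** Decides whether the candidate should replace the eviction victim. */
  boolean admit(int candidateFreq, int victimFreq) {
    if (candidateFreq > victimFreq) {
      unadmittedItems = 0; // assumption: an admission ends the rejection streak
      return true;
    }
    // The guard from the comment. Because || short-circuits, a cold candidate
    // is rejected without incrementing the streak counter.
    if ((candidateFreq <= WARM_THRESHOLD) || (++unadmittedItems < REJECTION_LIMIT)) {
      return false;
    }
    // Warm candidate after a long rejection streak: admit about 1 in 100.
    boolean admitted = random.nextInt(RANDOM_ADMIT_ONE_IN) == 0;
    if (admitted) {
      unadmittedItems = 0; // assumption: same reset as a normal admission
    }
    return admitted;
  }

  /** Current length of the warm-candidate rejection streak. */
  int unadmittedItems() {
    return unadmittedItems;
  }
}
```

Under these assumptions a scan of one-hit wonders never touches the counter, so
it should not trigger random admittance, while a run of warm but rejected
candidates starts getting the 1% chance from the 80th rejection onward.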
> Cache implementation using LIRS eviction for in-process page cache
> ------------------------------------------------------------------
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
> Issue Type: Improvement
> Components: Local Write-Read Paths
> Reporter: Branimir Lambov
> Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid
> having to explicitly mark compaction accesses as non-cacheable, we need a
> cache implementation that uses an eviction algorithm that can better handle
> non-recurring accesses.