[
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244406#comment-15244406
]
Ben Manes commented on CASSANDRA-11452:
---------------------------------------
It definitely would be nice to be able to reduce the per-entry cost, have
access to the hash, avoid lambdas by inlining into the code. I kept hoping Doug
would take a stab at it and see what ideas he'd use.
bq. It's a shame we don't have access to the CHM to do the sampling, as that
would make it robust to scans since all the members of the LRU would have high
frequencies.
Sadly Ehcache3 did this for their randomly sampled LRU. CHM has a weak hashing
function due to degrading into red-black trees, but they degraded it further
for speed. This results in -20% hit rate over LRU by taking an MRU-heavy
sample, surprisingly even in large caches. They also made it very slow, taking
minutes instead of a few seconds. I'm now very weary of that idea because it
can be done so horribly if naively handled.
I think for now I'm most comfortable using the following. I think its robust
enough, low cost, and should be hard to exploit (especially for an external
actor). If we discover it is not strong enough, we have a plethora of options
now. :-)
{code:java}
boolean admit(K candidateKey, K victimKey) {
int victimFreq = frequencySketch().frequency(victimKey);
int candidateFreq = frequencySketch().frequency(candidateKey);
if (candidateFreq > victimFreq) {
return true;
} else if (candidateFreq <= 5) {
return false;
}
int random = ThreadLocalRandom.current().nextInt();
return ((random & 127) == 0);
}
{code}
> Cache implementation using LIRS eviction for in-process page cache
> ------------------------------------------------------------------
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
> Issue Type: Improvement
> Components: Local Write-Read Paths
> Reporter: Branimir Lambov
> Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid
> having to explicitly marking compaction accesses as non-cacheable, we need a
> cache implementation that uses an eviction algorithm that can better handle
> non-recurring accesses.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)