[
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244390#comment-15244390
]
Benedict commented on CASSANDRA-11452:
--------------------------------------
bq. though a rough calculation indicates it isn't a huge savings
Assuming a basic CLHM, with CompressedOops on a 64-bit VM (Cassandra's
defaults) I calculate overhead inflation of around 22% - I reckon 72 bytes are
needed vs a possible 56 (once the 12 byte overheads are removed, and alignment
accounted for). You'd also be able to avoid recalculating the hash for the
sketches since it's memoized in CHM. Admittedly I don't 100% vouch for the
accuracy of those calculations as I'm doing it from memory.
I absolutely am not suggesting your calculation of cost/benefit is wrong
though, or that I would even have arrived at a different conclusion. Certainly
the user key/value sizes further amortize that overhead inflation, and for many
workloads the distinction is barely perceptible.
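As a quick check of the arithmetic, here is a minimal sketch that takes the 72 and 56 byte estimates above at face value (they are from-memory figures, not measurements). The 16 byte difference is about 22% of the current 72 bytes, and about 29% when measured against the 56 byte layout. The class and method names are made up for illustration.

```java
public class OverheadEstimate {
    /** Fraction of the current per-entry size that would be saved. */
    public static double savingFraction(int currentBytes, int possibleBytes) {
        return (currentBytes - possibleBytes) / (double) currentBytes;
    }

    /** Inflation of the current size relative to the smaller layout. */
    public static double inflationFraction(int currentBytes, int possibleBytes) {
        return (currentBytes - possibleBytes) / (double) possibleBytes;
    }

    public static void main(String[] args) {
        int current = 72;   // estimated per-entry bytes today (figure from the comment)
        int possible = 56;  // estimated per-entry bytes with the extra headers removed
        System.out.printf("saving:    %.1f%%%n", 100 * savingFraction(current, possible));
        System.out.printf("inflation: %.1f%%%n", 100 * inflationFraction(current, possible));
    }
}
```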
bq. What do you think about combining the approach
I assume you mean the inversion of that guard. It's a shame we don't have
access to the CHM to do the sampling, as that would make it robust to scans
since all the members of the LRU would have high frequencies. My only slight
concern is that we may have to wait for tens of thousands of rejections to
cycle out the collision, which is quite slow to respond. Raising the chance,
though, would harm scans. A couple of other options:
# Randomly sample the frequency of, say, 1% of the items we admit (on
admission, storing the last 16 or so), and compute the low quartile on demand
# On demand, sample a random short run of the sketch when we encounter this
situation, and compute some percentile (which one needs some thought)
Then either for 1% of admissions, or when your current guard is triggered,
compute this statistic for the guard. For absolute security, for say 0.01% of
candidates, admit without any check.
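A rough sketch of the first option, to make the moving parts concrete. It is not taken from any patch on this ticket: the class name, method names and the nearest-rank quartile choice are all assumptions, and how the resulting statistic is plugged into the existing guard is deliberately left out, since that is still open above.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper: keeps a small window of sampled admission frequencies. */
public class SampledAdmissionStats {
    private static final int WINDOW = 16;              // "the last 16 or so"
    private static final double SAMPLE_RATE = 0.01;    // sample ~1% of admissions
    private static final double BYPASS_RATE = 0.0001;  // ~0.01% of candidates skip the check

    private final int[] samples = new int[WINDOW];
    private int count;  // number of valid samples, at most WINDOW
    private int next;   // slot that the next sample overwrites
    private final Random random;

    public SampledAdmissionStats(Random random) {
        this.random = random;
    }

    /** Call on every admission; records the sketch frequency for roughly 1% of them. */
    public void onAdmit(int frequency) {
        if (random.nextDouble() < SAMPLE_RATE) {
            record(frequency);
        }
    }

    /** Unconditionally stores a sample, overwriting the oldest once the window is full. */
    public void record(int frequency) {
        samples[next] = frequency;
        next = (next + 1) % WINDOW;
        if (count < WINDOW) {
            count++;
        }
    }

    /** Low quartile of the stored samples, computed on demand; -1 if nothing is stored yet. */
    public int lowQuartile() {
        if (count == 0) {
            return -1;
        }
        // Slots fill from index 0, so the first `count` entries are the valid ones.
        int[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        return sorted[(count - 1) / 4];
    }

    /** True for roughly 0.01% of calls: the "admit without any check" escape hatch. */
    public boolean bypassCheck() {
        return random.nextDouble() < BYPASS_RATE;
    }
}
```

The sort only runs when the statistic is requested, so the per-admission cost is a single random draw plus an occasional array store.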
That all said, I expect for Cassandra's purposes many of the proposed solutions
so far will be sufficient, and I certainly wouldn't have any problem with the
solution you propose.
> Cache implementation using LIRS eviction for in-process page cache
> ------------------------------------------------------------------
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
> Issue Type: Improvement
> Components: Local Write-Read Paths
> Reporter: Branimir Lambov
> Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid
> having to explicitly mark compaction accesses as non-cacheable, we need a
> cache implementation that uses an eviction algorithm that can better handle
> non-recurring accesses.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)