[
https://issues.apache.org/jira/browse/SOLR-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ben Manes updated SOLR-8241:
----------------------------
Attachment: proposal.patch
I have ported some basic tests (testSimple, testTimeDecay). The first performs
access operations and the second ensures that frequency is taken into account.
The changes also add cumulative stats by aggregating during warm() (this was
simpler than the init approach since Caffeine's stats object is immutable).
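
As an illustration of the aggregation described above, here is a minimal
sketch of carrying cumulative stats across cache generations by summing
immutable snapshots during warm(). It is not the code from the patch: the
names CumulativeStatsSketch, Snapshot and StatsHolder are hypothetical, and
Snapshot only stands in for an immutable library stats object.

```java
final class CumulativeStatsSketch {

    /** Hypothetical immutable snapshot, standing in for a library stats object. */
    static final class Snapshot {
        final long hits;
        final long misses;

        Snapshot(long hits, long misses) {
            this.hits = hits;
            this.misses = misses;
        }

        /** Returns a new snapshot; neither operand is modified. */
        Snapshot plus(Snapshot other) {
            return new Snapshot(hits + other.hits, misses + other.misses);
        }
    }

    /** Hypothetical cache wrapper; only the stats bookkeeping is shown. */
    static final class StatsHolder {
        // Totals inherited from all earlier generations, fixed at warm() time.
        private Snapshot priorStats = new Snapshot(0, 0);
        private long hits;
        private long misses;

        void recordHit() { hits++; }

        void recordMiss() { misses++; }

        /** Stats for this generation only. */
        Snapshot current() {
            return new Snapshot(hits, misses);
        }

        /** Stats for this generation plus everything inherited. */
        Snapshot cumulative() {
            return priorStats.plus(current());
        }

        /** Called on the new instance when it is warmed from the old one. */
        void warm(StatsHolder old) {
            priorStats = old.cumulative();
        }
    }

    public static void main(String[] args) {
        StatsHolder first = new StatsHolder();
        first.recordHit();
        first.recordMiss();

        StatsHolder second = new StatsHolder();
        second.warm(first);
        second.recordHit();

        Snapshot total = second.cumulative();
        System.out.println("hits=" + total.hits + " misses=" + total.misses);
    }
}
```

Because the snapshot is never mutated, the new instance only has to keep the
sum it inherited and add its own counters when cumulative stats are requested.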

A minor change renames the class to TinyLfuCache to emphasize the policy over
the library. That conforms with the HBase and Accumulo integrations and
matches the existing naming convention.

This version of the patch requires changes in Caffeine 2.3.4-SNAPSHOT. I
improved the hot iteration order, which previously returned entries in warm,
hot, cold order. Given real-world cache sizes it might not have made a
difference, but it was a required improvement for the tests. So I'm adding
this version as a proposal and can cut a release when you're ready for
integration.
> Evaluate W-TinyLfu cache
> ------------------------
>
> Key: SOLR-8241
> URL: https://issues.apache.org/jira/browse/SOLR-8241
> Project: Solr
> Issue Type: Wish
> Components: search
> Reporter: Ben Manes
> Priority: Minor
> Attachments: SOLR-8241.patch, SOLR-8241.patch, proposal.patch
>
>
> SOLR-2906 introduced an LFU cache and in-progress SOLR-3393 makes it O(1).
> The discussions seem to indicate that the higher hit rate (vs LRU) is offset
> by the slower performance of the implementation. An original goal appeared to
> be to introduce ARC, a patented algorithm that uses ghost entries to retain
> history information.
> My analysis of Window TinyLfu indicates that it may be a better option. It
> uses a frequency sketch to compactly estimate an entry's popularity. It uses
> LRU to capture recency and operate in O(1) time. On the available academic
> traces, the policy provides a near-optimal hit rate regardless of the
> workload.
> I'm getting ready to release the policy in Caffeine, which Solr already has a
> dependency on. However, the code is fairly straightforward, and porting it
> into Solr's caches instead is a pragmatic alternative. More interesting is
> what the impact would be on Solr's workloads, along with feedback on the
> policy's design.
> https://github.com/ben-manes/caffeine/wiki/Efficiency
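
To make the "frequency sketch" in the description above concrete, here is a
minimal count-min style sketch in Java. It is a simplified illustration under
stated assumptions, not Caffeine's implementation: the class name
FrequencySketchExample, the seed constants and the use of full int counters
are all choices made for this example, and aging of old counts is omitted.

```java
final class FrequencySketchExample {
    // Arbitrary odd multipliers, one per hash row (assumed for this example).
    private static final int[] SEEDS = {
        0x9E3779B1, 0x85EBCA6B, 0xC2B2AE35, 0x27D4EB2F
    };

    private final int[][] table; // one row of counters per hash function
    private final int mask;      // width - 1, valid because width is a power of two

    FrequencySketchExample(int width) {
        if (width <= 0 || Integer.bitCount(width) != 1) {
            throw new IllegalArgumentException("width must be a power of two");
        }
        this.table = new int[SEEDS.length][width];
        this.mask = width - 1;
    }

    /** Maps a key to a counter slot in the given row. */
    private int index(Object key, int row) {
        int h = key.hashCode() * SEEDS[row];
        h ^= h >>> 15;
        return h & mask;
    }

    /** Records one access to the key in every row. */
    void increment(Object key) {
        for (int row = 0; row < SEEDS.length; row++) {
            table[row][index(key, row)]++;
        }
    }

    /**
     * Estimates how often the key was seen. Collisions can only inflate a
     * counter, so the minimum across rows never underestimates.
     */
    int frequency(Object key) {
        int min = Integer.MAX_VALUE;
        for (int row = 0; row < SEEDS.length; row++) {
            min = Math.min(min, table[row][index(key, row)]);
        }
        return min;
    }

    public static void main(String[] args) {
        FrequencySketchExample sketch = new FrequencySketchExample(1024);
        for (int i = 0; i < 5; i++) {
            sketch.increment("popular");
        }
        sketch.increment("rare");
        System.out.println("popular ~ " + sketch.frequency("popular"));
        System.out.println("rare ~ " + sketch.frequency("rare"));
    }
}
```

The memory cost is fixed by the table size rather than by the number of keys
seen, which is what lets such a structure estimate popularity compactly
instead of retaining per-entry history.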