[
https://issues.apache.org/jira/browse/SOLR-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15517090#comment-15517090
]
Ben Manes commented on SOLR-8241:
---------------------------------
Expiration is tricky because it means the data is no longer valid and should
not be consumed. The middle ground here is refreshAfterWrite, which serves
stale entries while asynchronously reloading the value. That covers the common
case: active entries are not penalized by eviction, while inactive ones are
left to expire.
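
To make the refresh-versus-expire distinction concrete, here is a minimal
sketch of the semantics in plain Java. It is not Caffeine's implementation:
the RefreshingCache class, its constructor parameters, and the injected clock
and executor are all hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration of refresh-after-write semantics; not Caffeine code.
final class RefreshingCache<K, V> {
  private static final class Entry<V> {
    final V value;
    final long writeTime;
    // Ensures at most one background reload is scheduled per entry.
    final AtomicBoolean refreshing = new AtomicBoolean();

    Entry(V value, long writeTime) {
      this.value = value;
      this.writeTime = writeTime;
    }
  }

  private final Map<K, Entry<V>> data = new ConcurrentHashMap<>();
  private final Function<K, V> loader;
  private final Executor executor;
  private final LongSupplier clock;
  private final long refreshAfter;
  private final long expireAfter;

  RefreshingCache(Function<K, V> loader, Executor executor, LongSupplier clock,
      long refreshAfter, long expireAfter) {
    this.loader = loader;
    this.executor = executor;
    this.clock = clock;
    this.refreshAfter = refreshAfter;
    this.expireAfter = expireAfter;
  }

  V get(K key) {
    long now = clock.getAsLong();
    Entry<V> entry = data.get(key);
    if (entry == null || now - entry.writeTime >= expireAfter) {
      // Missing or expired: the old value must not be served, so load inline.
      V value = loader.apply(key);
      data.put(key, new Entry<>(value, now));
      return value;
    }
    if (now - entry.writeTime >= refreshAfter
        && entry.refreshing.compareAndSet(false, true)) {
      // Stale but still valid: reload in the background, serve the old value now.
      executor.execute(
          () -> data.put(key, new Entry<>(loader.apply(key), clock.getAsLong())));
    }
    return entry.value;
  }
}
```

An entry that keeps being read is reloaded in the background and never hits
the expiration path, while an entry nobody reads is never refreshed and
eventually expires. In this sketch expiration is only checked lazily on the
next read.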
That probably isn't enough, and it's impossible to cover all use cases. So
instead it's more of a data structure that is (hopefully) malleable enough to
support custom workarounds. The CacheWriter can be used to create a victim
cache, which a CacheLoader could retrieve from. So you could let expired
entries populate the victim and be promoted back into the cache, sometimes
within the same atomic operation. Then a rewarming could clear the victim when
it's done, as its contents are no longer needed. So something like this might
be workable.
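
A rough sketch of the victim-cache idea follows, written against plain
ConcurrentHashMaps rather than Caffeine's CacheWriter and CacheLoader so that
it stays self-contained. The VictimBackedCache class and its method names are
hypothetical. onExpire stands in for whatever removal callback the writer
would receive, and the lambda inside get plays the loader's role.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration of a victim cache; not Caffeine or Solr code.
final class VictimBackedCache<K, V> {
  private final Map<K, V> primary = new ConcurrentHashMap<>();
  private final Map<K, V> victim = new ConcurrentHashMap<>();
  private final Function<K, V> loader;

  VictimBackedCache(Function<K, V> loader) {
    this.loader = loader;
  }

  // Stand-in for the writer's removal callback: an expired entry is moved
  // into the victim instead of being dropped. The move across the two maps
  // is not atomic in this sketch.
  void onExpire(K key) {
    V value = primary.remove(key);
    if (value != null) {
      victim.put(key, value);
    }
  }

  // Stand-in for the loader: a miss first tries to promote the entry from
  // the victim and only then falls back to the expensive load. The promotion
  // happens inside the primary map's atomic computeIfAbsent.
  V get(K key) {
    return primary.computeIfAbsent(key, k -> {
      V rescued = victim.remove(k);
      return (rescued != null) ? rescued : loader.apply(k);
    });
  }

  // Called once rewarming completes; the victim's contents are no longer needed.
  void rewarmingFinished() {
    victim.clear();
  }

  boolean inVictim(K key) {
    return victim.containsKey(key);
  }
}
```

A caller would invoke get as usual, call onExpire wherever the real cache
reports an expired entry, and call rewarmingFinished when the new searcher's
warm-up ends.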
> Evaluate W-TinyLfu cache
> ------------------------
>
> Key: SOLR-8241
> URL: https://issues.apache.org/jira/browse/SOLR-8241
> Project: Solr
> Issue Type: Wish
> Components: search
> Reporter: Ben Manes
> Priority: Minor
> Attachments: SOLR-8241.patch
>
>
> SOLR-2906 introduced an LFU cache and in-progress SOLR-3393 makes it O(1).
> The discussions seem to indicate that the higher hit rate (vs LRU) is offset
> by the slower performance of the implementation. An original goal appeared to
> be to introduce ARC, a patented algorithm that uses ghost entries to retain
> history information.
> My analysis of Window TinyLfu indicates that it may be a better option. It
> uses a frequency sketch to compactly estimate an entry's popularity. It uses
> LRU to capture recency and operate in O(1) time. When using available
> academic traces the policy provides a near optimal hit rate regardless of the
> workload.
> I'm getting ready to release the policy in Caffeine, which Solr already has a
> dependency on. But, the code is fairly straightforward and a port into Solr's
> caches instead is a pragmatic alternative. More interesting is what the
> impact would be in Solr's workloads and feedback on the policy's design.
> https://github.com/ben-manes/caffeine/wiki/Efficiency