[
https://issues.apache.org/jira/browse/SOLR-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261197#comment-13261197
]
Hoss Man commented on SOLR-3393:
--------------------------------
bq. I will attempt to make a new O(1) cache called FastLFUCache
{{#OhDearGodPleaseNotAnotherClassWithFastInTheName}}
Please, please, please let's end the madness of subjective adjectives in class
names ... if it's an LFU cache wrapped around a "hawtdb" why don't we just call
it "HawtDbLFUCache"?
bq. I've been working on this. I've come to realize that I don't completely
understand how CacheRegenerator works. I suspect that it is geared around LRU
caches and that the new cache won't have any of the frequency information from
the old one, it will just put the entries into the cache as if they were new.
Can anyone confirm this?
The idea behind the CacheRegenerator API is to be as simple as possible and
agnostic to:
* the Cache Impl (e.g.: LRUCache vs LFUCache vs HawtDbLFUCache)
* the cache usage (e.g.: Query->DocSets vs Query->DocList vs
String->MyCustomClass)
* the means of generating values from keys (e.g.: how do you know which
MyCustomClass should be cached for which String)
... so you can have a custom (named) cache instance declared in your
solrconfig.xml with your own MySpecialCacheRegenerator that knows about your
use case and might do something special with the keys/values (like: short-circuit
part of the generation if it can see the data hasn't changed, or read from
authoritative data files outside of Solr, etc...) and then use *any* Cache impl
class that your heart desires, and things will still work right.
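To illustrate that separation, here is a minimal sketch of a use-case-specific regenerator that short-circuits when the data hasn't changed. Everything in it is an assumption made for illustration: {{SimpleRegenerator}} is a simplified stand-in rather than Solr's actual CacheRegenerator interface (which also receives the new searcher and uses Solr's own cache types), plain Maps stand in for the cache impls, and {{changedKeys}} / {{authoritativeData}} are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for Solr's CacheRegenerator.
interface SimpleRegenerator<K, V> {
    // Puts a value for oldKey into newCache; returning false stops the warming loop.
    boolean regenerateItem(Map<K, V> newCache, K oldKey, V oldVal);
}

// Hypothetical regenerator that knows about one specific use case.
class MySpecialCacheRegenerator implements SimpleRegenerator<String, String> {
    private final Map<String, String> authoritativeData; // stands in for data files outside Solr
    private final Set<String> changedKeys;               // keys whose source data changed
    int reloads = 0;                                     // counts the expensive regenerations

    MySpecialCacheRegenerator(Map<String, String> authoritativeData, Set<String> changedKeys) {
        this.authoritativeData = authoritativeData;
        this.changedKeys = changedKeys;
    }

    @Override
    public boolean regenerateItem(Map<String, String> newCache, String oldKey, String oldVal) {
        if (!changedKeys.contains(oldKey)) {
            // Short-circuit: the data hasn't changed, so reuse the old value.
            newCache.put(oldKey, oldVal);
        } else {
            // Otherwise regenerate from the authoritative source.
            reloads++;
            newCache.put(oldKey, authoritativeData.get(oldKey));
        }
        return true;
    }
}

class RegeneratorDemo {
    // The warming loop only talks to the regenerator, so any Map impl works as the cache.
    static <K, V> void warm(Map<K, V> newCache, Map<K, V> oldCache, SimpleRegenerator<K, V> regen) {
        for (Map.Entry<K, V> e : oldCache.entrySet()) {
            if (!regen.regenerateItem(newCache, e.getKey(), e.getValue())) {
                break;
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> oldCache = new LinkedHashMap<>();
        oldCache.put("a", "v1");
        oldCache.put("b", "v1");
        MySpecialCacheRegenerator regen =
            new MySpecialCacheRegenerator(Map.of("a", "v1", "b", "v2"), Set.of("b"));
        Map<String, String> newCache = new HashMap<>();
        warm(newCache, oldCache, regen);
        System.out.println(newCache + " reloads=" + regen.reloads);
    }
}
```

The regenerator never needs to know whether the cache behind it is LRU, LFU, or something else; swapping the HashMap for another Map impl leaves it untouched.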
bq. After the new cache is regenerated, should I go through the new cache, grab
the frequency information from the old cache with each key, and fix the new
cache up?
You certainly could -- when {{(new HawtDbLFUCache(...)).warm(...)}} is called,
it needs to delegate to the regenerator for pulling values from the "old"
cache, but that doesn't mean it can't also directly ask the "old" cache
instance for stats about each of the old keys as it loops over them --
remember: the "new" cache is the one inspecting the "old" cache and deciding
what things to ask the regenerator to generate.
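A rough sketch of what such a warm() could look like, under stated assumptions: {{FrequencyCache}} and {{WarmRegenerator}} are hypothetical, simplified types invented for this example (not Solr's SolrCache or CacheRegenerator, and not the hawtdb-based implementation), and "frequency" is reduced to a plain per-key hit counter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a regenerator used during warming.
interface WarmRegenerator<K, V> {
    // Puts a value for oldKey into newCache; returning false stops the warming loop.
    boolean regenerateItem(FrequencyCache<K, V> newCache, FrequencyCache<K, V> oldCache,
                           K oldKey, V oldVal);
}

// Hypothetical cache that tracks a hit count per key (no eviction, for brevity).
class FrequencyCache<K, V> {
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Long> hits = new HashMap<>();

    V get(K key) {
        V value = values.get(key);
        if (value != null) {
            hits.merge(key, 1L, Long::sum); // record one more hit
        }
        return value;
    }

    void put(K key, V value) {
        values.put(key, value);
        hits.putIfAbsent(key, 0L); // new entries start with no hits
    }

    long hitCount(K key) {
        return hits.getOrDefault(key, 0L);
    }

    // Called on the NEW cache: it walks the old cache's keys, delegates value
    // generation to the regenerator, and separately asks the old cache for stats.
    void warm(FrequencyCache<K, V> old, WarmRegenerator<K, V> regenerator) {
        for (Map.Entry<K, V> entry : old.values.entrySet()) {
            K key = entry.getKey();
            boolean keepGoing = regenerator.regenerateItem(this, old, key, entry.getValue());
            if (values.containsKey(key)) {
                // Copy the old frequency only if the regenerator actually added the key.
                hits.put(key, old.hitCount(key));
            }
            if (!keepGoing) {
                break;
            }
        }
    }
}

class WarmDemo {
    public static void main(String[] args) {
        FrequencyCache<String, String> oldCache = new FrequencyCache<>();
        oldCache.put("q1", "r1");
        oldCache.put("q2", "r2");
        oldCache.get("q1");
        oldCache.get("q1");
        oldCache.get("q2");

        FrequencyCache<String, String> newCache = new FrequencyCache<>();
        // Trivial regenerator that just copies the old value across.
        newCache.warm(oldCache, (n, o, key, val) -> { n.put(key, val); return true; });
        System.out.println("q1=" + newCache.hitCount("q1") + " q2=" + newCache.hitCount("q2"));
    }
}
```

The regenerator stays unaware of frequencies; carrying them over is entirely a decision of the new cache's warm(), so dropping the copy step leaves every warmed entry starting from zero.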
But I question whether you really want any sort of stats from the "old" cache
copied over to the "new" cache. It is, after all, a completely new cache --
with new usage. Should the stats really be preserved forever? Regardless of
how popular an object was in the "old" cache instance, should we automatically
assume it's equally popular in the "new" cache instance?
> Implement an optimized LFUCache
> -------------------------------
>
> Key: SOLR-3393
> URL: https://issues.apache.org/jira/browse/SOLR-3393
> Project: Solr
> Issue Type: Improvement
> Components: search
> Affects Versions: 3.6, 4.0
> Reporter: Shawn Heisey
> Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3393.patch, SOLR-3393.patch
>
>
> SOLR-2906 gave us an inefficient LFU cache modeled on
> FastLRUCache/ConcurrentLRUCache. It could use some serious improvement. The
> following project includes an Apache 2.0 licensed O(1) implementation. The
> second link is the paper (PDF warning) it was based on:
> https://github.com/chirino/hawtdb
> http://dhruvbird.com/lfu.pdf
> Using this project and paper, I will attempt to make a new O(1) cache called
> FastLFUCache that is modeled on LRUCache.java. This will (for now) leave the
> existing LFUCache/ConcurrentLFUCache implementation in place.