[ https://issues.apache.org/jira/browse/LUCENE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226284#comment-14226284 ]
Robert Muir commented on LUCENE-6077:
-------------------------------------
This looks great!
Do we really need to default CachingWrapperFilter to a "stupid" policy?
Is there a better name for the FilterCache.cache() method? It can be read as a
noun or a verb, so it's kind of confusing. Maybe doCache would be better?
CachingWrapperFilter's new ctor: can we fix the typo?
FilterCachingPolicy.onCache: can we correct the param name?
> Add a filter cache
> ------------------
>
> Key: LUCENE-6077
> URL: https://issues.apache.org/jira/browse/LUCENE-6077
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Fix For: 5.0
>
> Attachments: LUCENE-6077.patch
>
>
> Lucene already has filter caching abilities through CachingWrapperFilter, but
> CachingWrapperFilter requires you to know which filters you want to cache
> up-front.
> Caching filters is not trivial. If you cache too aggressively, then you slow
> things down since you need to iterate over all documents that match the
> filter in order to load it into an in-memory cacheable DocIdSet. On the other
> hand, if you don't cache at all, you are potentially missing interesting
> speed-ups on frequently-used filters.
> It would be nice to have a generic filter cache that tracks usage of
> individual filters and decides whether or not to cache a filter on a given
> segment based on usage statistics and various heuristics, such as:
> - the overhead to cache the filter (for instance some filters produce
> DocIdSets that are already cacheable)
> - the cost to build the DocIdSet (the getDocIdSet method is very expensive
> on some filters such as MultiTermQueryWrapperFilter that potentially need to
> merge lots of postings lists)
> - the segment we are searching on (flushed segments will likely be merged
> right away so it's probably not worth building a cache on such segments)
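
The heuristics listed in the description could be combined roughly as in the
sketch below. This is only an illustration, not the code in the attached patch
and not Lucene's API: the class UsageTrackingPolicy, its methods, the string
filter keys and every threshold are hypothetical. It also assumes that a small
document count is a reasonable stand-in for "freshly flushed segment that will
be merged soon", which the issue does not state.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch (not Lucene's API): decides whether to cache a filter on a segment. */
class UsageTrackingPolicy {
    private final int historySize;    // number of recent filter uses to remember
    private final int minUses;        // uses required before caching an ordinary filter
    private final int minUsesLow;     // lower bar for costly-to-build or cheap-to-cache filters
    private final int minSegmentDocs; // assumption: smaller segments are likely to be merged soon
    private final Deque<String> history = new ArrayDeque<>();
    private final Map<String, Integer> counts = new HashMap<>();

    UsageTrackingPolicy(int historySize, int minUses, int minUsesLow, int minSegmentDocs) {
        this.historySize = historySize;
        this.minUses = minUses;
        this.minUsesLow = minUsesLow;
        this.minSegmentDocs = minSegmentDocs;
    }

    /** Records one use of the filter identified by filterKey, forgetting the oldest use if full. */
    void onUse(String filterKey) {
        history.addLast(filterKey);
        counts.merge(filterKey, 1, Integer::sum);
        if (history.size() > historySize) {
            String evicted = history.removeFirst();
            Integer c = counts.get(evicted);
            if (c != null && c > 1) {
                counts.put(evicted, c - 1);
            } else {
                counts.remove(evicted);
            }
        }
    }

    /**
     * Returns true if the filter looks worth caching on a segment.
     * costlyToBuild: building the DocIdSet is expensive (e.g. many postings lists to merge).
     * cheapToCache: the filter already produces a cacheable DocIdSet, so caching adds little overhead.
     */
    boolean shouldCache(String filterKey, boolean costlyToBuild, boolean cheapToCache,
                        int segmentDocCount) {
        if (segmentDocCount < minSegmentDocs) {
            return false; // not worth building a cache on a segment that may disappear soon
        }
        int uses = counts.getOrDefault(filterKey, 0);
        int required = (costlyToBuild || cheapToCache) ? minUsesLow : minUses;
        return uses >= required;
    }
}
```

A caller would invoke onUse each time a filter is applied and consult
shouldCache per segment before materializing a DocIdSet. The bounded history
means a filter that stops being used eventually drops back below the threshold.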