Adrien Grand created LUCENE-6077:
------------------------------------
Summary: Add a filter cache
Key: LUCENE-6077
URL: https://issues.apache.org/jira/browse/LUCENE-6077
Project: Lucene - Core
Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
Fix For: 5.0
Lucene already has filter caching abilities through CachingWrapperFilter, but
CachingWrapperFilter requires you to know which filters you want to cache
up-front.
Caching filters is not trivial. If you cache too aggressively, then you slow
things down since you need to iterate over all documents that match the filter
in order to load it into an in-memory cacheable DocIdSet. On the other hand, if
you don't cache at all, you are potentially missing interesting speed-ups on
frequently-used filters.
It would be nice to have a generic filter cache that tracks usage for
individual filters and decides whether or not to cache a filter on a given
segment based on usage statistics and various heuristics, such as:
- the overhead to cache the filter (for instance some filters produce
DocIdSets that are already cacheable)
- the cost to build the DocIdSet (the getDocIdSet method is very expensive on
some filters such as MultiTermQueryWrapperFilter that potentially need to merge
lots of postings lists)
- the segment we are searching on (freshly flushed segments will likely be
merged right away, so it's probably not worth building a cache on such segments)
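
As a rough illustration of how the heuristics above could combine, here is a
minimal sketch in plain Java. It is not an existing Lucene API: the class name
UsageTrackingCachePolicy, its methods, the two thresholds and the segment-size
ratio are all hypothetical names chosen for this example, and segment size is
used only as a stand-in for "this segment is likely to be merged away soon".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a usage-based caching policy; not a Lucene class. */
final class UsageTrackingCachePolicy {
    // Number of times each filter has been used so far, keyed by the filter.
    private final Map<Object, Integer> useCounts = new HashMap<>();
    // Uses required before caching filters that are costly to build
    // or whose DocIdSet is already cacheable (assumed value, tune freely).
    private final int eagerThreshold;
    // Uses required before caching any other filter (assumed value).
    private final int defaultThreshold;
    // Segments smaller than this fraction of the index are never cached.
    private final double minSegmentRatio;

    UsageTrackingCachePolicy(int eagerThreshold, int defaultThreshold,
                             double minSegmentRatio) {
        this.eagerThreshold = eagerThreshold;
        this.defaultThreshold = defaultThreshold;
        this.minSegmentRatio = minSegmentRatio;
    }

    /** Records one use of the filter and returns its updated use count. */
    int onUse(Object filterKey) {
        return useCounts.merge(filterKey, 1, Integer::sum);
    }

    /** Decides whether the filter should be cached on the given segment. */
    boolean shouldCache(Object filterKey, boolean costlyToBuild,
                        boolean alreadyCacheable,
                        int segmentMaxDoc, int indexMaxDoc) {
        // Small segments are likely to be merged soon: skip them.
        if (indexMaxDoc <= 0
                || (double) segmentMaxDoc / indexMaxDoc < minSegmentRatio) {
            return false;
        }
        int uses = useCounts.getOrDefault(filterKey, 0);
        // Cache sooner when building is expensive or caching is nearly free.
        int threshold = (costlyToBuild || alreadyCacheable)
                ? eagerThreshold : defaultThreshold;
        return uses >= threshold;
    }
}
```

With thresholds of 2 and 5 and a ratio of 0.03, a filter that is costly to
build would become cacheable on a large segment after two uses, a cheap one
only after five, and neither would be cached on a tiny segment.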