Mikhail Khludnev created LUCENE-5052:
----------------------------------------

             Summary: bitset codec for off heap filters
                 Key: LUCENE-5052
                 URL: https://issues.apache.org/jira/browse/LUCENE-5052
             Project: Lucene - Core
          Issue Type: New Feature
          Components: core/codecs
            Reporter: Mikhail Khludnev
             Fix For: 5.0


Colleagues,

When we filter, we don't care about any of the scoring factors (norms, 
positions, tf), but filtering should be fast. The obvious way to handle this 
is to decode the postings list and cache it on the heap (CachingWrapperFilter, 
Solr's DocSet). Both consuming heap and decoding are expensive. 
Let's write a postings list as a bitset if df is greater than the segment's 
maxDoc/8 (what about skip lists? and overall performance?). 
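
To make the threshold concrete, here is a minimal sketch of the decision. It 
uses plain java.util.BitSet rather than any Lucene API, and the class and 
method names (BitsetThresholdSketch, useBitset, bitsetBytes, toBitset) are 
hypothetical. The reasoning it assumes: a bitset over a segment costs about 
maxDoc/8 bytes regardless of df, while a byte-aligned delta encoding spends at 
least one byte per document, so above df = maxDoc/8 the bitset should be no 
larger. Bit-packed block encodings can beat one byte per document, so treat 
the break-even point as approximate.

```java
import java.util.BitSet;

public class BitsetThresholdSketch {

    /** The rule proposed above: use a bitset when df > maxDoc/8. */
    static boolean useBitset(int docFreq, int maxDoc) {
        return docFreq > maxDoc / 8;
    }

    /** Size of a fixed-length bitset covering doc IDs 0..maxDoc-1. */
    static int bitsetBytes(int maxDoc) {
        return (maxDoc + 7) / 8;
    }

    /** Turns a list of doc IDs into a bitset with one bit per document. */
    static BitSet toBitset(int[] docIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int docId : docIds) {
            bits.set(docId);
        }
        return bits;
    }

    public static void main(String[] args) {
        int maxDoc = 1000;
        System.out.println(useBitset(126, maxDoc)); // above the threshold
        System.out.println(useBitset(125, maxDoc)); // at the threshold, stays a list
        System.out.println(bitsetBytes(maxDoc) + " bytes for the bitset");
    }
}
```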
Besides the codec implementation, the trickiest part for me is designing the 
API for this. How can we let the app know that a term query doesn't need to be 
cached on the heap, but can be held as an mmapped bitset?
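
For the storage side only (not the API question), a rough sketch of what 
"held as an mmapped bitset" could look like with plain java.nio. Nothing here 
is Lucene code; MmapBitsetSketch and its methods are hypothetical names, and 
the file layout (one bit per doc ID, little-endian within each byte) is an 
assumption made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.BitSet;

public class MmapBitsetSketch {

    /** Writes the bitset as exactly ceil(maxDoc/8) bytes. */
    static void write(Path file, BitSet bits, int maxDoc) throws IOException {
        byte[] out = new byte[(maxDoc + 7) / 8];
        // toByteArray() is little-endian and drops trailing zero bytes,
        // so copy it into a zero-padded array of the full length.
        byte[] raw = bits.toByteArray();
        System.arraycopy(raw, 0, out, 0, Math.min(raw.length, out.length));
        Files.write(file, out);
    }

    /** Maps the file read-only; the mapping stays valid after the channel closes. */
    static MappedByteBuffer map(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    /** Tests one doc ID directly against the mapped bytes, with no heap copy. */
    static boolean get(MappedByteBuffer buf, int docId) {
        return (buf.get(docId >>> 3) & (1 << (docId & 7))) != 0;
    }

    /** Convenience round trip: write doc IDs to a temp file, map it, probe one ID. */
    static boolean roundTrip(int[] docIds, int maxDoc, int probe) {
        try {
            BitSet bits = new BitSet(maxDoc);
            for (int docId : docIds) {
                bits.set(docId);
            }
            Path file = Files.createTempFile("bitset-sketch", ".bin");
            file.toFile().deleteOnExit();
            write(file, bits, maxDoc);
            return get(map(file), probe);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] docIds = {3, 64, 99};
        System.out.println(roundTrip(docIds, 100, 64));
        System.out.println(roundTrip(docIds, 100, 65));
    }
}
```

The filter then costs page cache rather than heap, and nothing has to be 
decoded before the first lookup.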

WDYT?  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
