[ 
https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13707268#comment-13707268
 ] 

Adrien Grand commented on LUCENE-5101:
--------------------------------------

A quick note about alternative DocIdSet we now have. I wrote a benchmark 
(attached) to see how they compared to FixedBitSet, you can look at the results 
here: http://people.apache.org/~jpountz/doc_id_sets.html Please note that 
EliasFanoDocIdSet is disadvantaged for advance() since it doesn't have an index 
yet, it will be interesting to run this benchmark again when it gets one.

Maybe we could use these numbers to have better defaults in CWF? (and only use 
FixedBitSet for dense sets for example)
                
> make it easier to plugin different bitset implementations to 
> CachingWrapperFilter
> ---------------------------------------------------------------------------------
>
>                 Key: LUCENE-5101
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5101
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>
> Currently this is possible, but its not so friendly:
> {code}
>   protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) 
> throws IOException {
>     if (docIdSet == null) {
>       // this is better than returning null, as the nonnull result can be 
> cached
>       return EMPTY_DOCIDSET;
>     } else if (docIdSet.isCacheable()) {
>       return docIdSet;
>     } else {
>       final DocIdSetIterator it = docIdSet.iterator();
>       // null is allowed to be returned by iterator(),
>       // in this case we wrap with the sentinel set,
>       // which is cacheable.
>       if (it == null) {
>         return EMPTY_DOCIDSET;
>       } else {
> /* INTERESTING PART */
>         final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
>         bits.or(it);
>         return bits;
> /* END INTERESTING PART */
>       }
>     }
>   }
> {code}
> Is there any value to having all this other logic in the protected API? It 
> seems like something thats not useful for a subclass... Maybe this stuff can 
> become final, and "INTERESTING PART" calls a simpler method, something like:
> {code}
> protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) {
>   final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
>   bits.or(iterator);
>   return bits;
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to