[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13707889#comment-13707889 ]
Robert Muir commented on LUCENE-5101:
-------------------------------------

{quote}
Maybe we could use these numbers to have better defaults in CWF? (and only use FixedBitSet for dense sets for example)
{quote}

+1: we should have better defaults. Ideally we would use DISI.cost() to estimate the sparsity.

One problem is that a lot of the costly filters that people want to cache have a crap cost() implementation. E.g. MultiTermQueryWrapperFilter could instead getAndSet() and return a DISI with an actual accurate cost().

Or instead, for now, we could also check firstDocID too...

> make it easier to plugin different bitset implementations to CachingWrapperFilter
> ---------------------------------------------------------------------------------
>
>                 Key: LUCENE-5101
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5101
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>
> Currently this is possible, but it's not so friendly:
> {code}
>   protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) throws IOException {
>     if (docIdSet == null) {
>       // this is better than returning null, as the nonnull result can be cached
>       return EMPTY_DOCIDSET;
>     } else if (docIdSet.isCacheable()) {
>       return docIdSet;
>     } else {
>       final DocIdSetIterator it = docIdSet.iterator();
>       // null is allowed to be returned by iterator(),
>       // in this case we wrap with the sentinel set,
>       // which is cacheable.
>       if (it == null) {
>         return EMPTY_DOCIDSET;
>       } else {
>         /* INTERESTING PART */
>         final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
>         bits.or(it);
>         return bits;
>         /* END INTERESTING PART */
>       }
>     }
>   }
> {code}
> Is there any value to having all this other logic in the protected API? It seems like something that's not useful for a subclass...
> Maybe this stuff can become final, and "INTERESTING PART" calls a simpler method, something like:
> {code}
>   protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) throws IOException {
>     final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
>     bits.or(iterator);
>     return bits;
>   }
> {code}
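
To make the "better defaults" idea from the comment concrete, here is a minimal, self-contained sketch of a cacheImpl-style hook that picks a dense or sparse representation from an estimated cost. Everything in it is an assumption for illustration: `CachedSetSketch`, `DocSet`, `DenseSet`, `SparseSet` and the 1% threshold are hypothetical stand-ins, not Lucene classes or values, and the doc ids are passed as a plain `int[]` rather than a DocIdSetIterator so the example runs without Lucene on the classpath.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Illustrative stand-in only; none of these types are Lucene classes. */
final class CachedSetSketch {

    /** Hypothetical minimal view of a cached doc-id set. */
    interface DocSet {
        boolean contains(int doc);
        int size();
    }

    /** Dense representation: one bit per document, in the spirit of a fixed bitset. */
    static final class DenseSet implements DocSet {
        private final BitSet bits;

        DenseSet(int[] docs, int maxDoc) {
            bits = new BitSet(maxDoc);
            for (int doc : docs) {
                bits.set(doc);
            }
        }

        @Override public boolean contains(int doc) { return bits.get(doc); }
        @Override public int size() { return bits.cardinality(); }
    }

    /** Sparse representation: a sorted array of doc ids (assumed distinct). */
    static final class SparseSet implements DocSet {
        private final int[] sorted;

        SparseSet(int[] docs) {
            sorted = docs.clone();
            Arrays.sort(sorted);
        }

        @Override public boolean contains(int doc) { return Arrays.binarySearch(sorted, doc) >= 0; }
        @Override public int size() { return sorted.length; }
    }

    /** Arbitrary cut-off chosen for this sketch; a real default would need benchmarking. */
    static final double DENSE_THRESHOLD = 0.01;

    /**
     * Chooses a representation from the estimated cost relative to maxDoc.
     * The cost is only an estimate, so a poor estimate leads to a poor choice,
     * which is the concern raised in the comment about cost() implementations.
     */
    static DocSet cacheImpl(int[] docs, long estimatedCost, int maxDoc) {
        boolean dense = maxDoc > 0 && estimatedCost >= DENSE_THRESHOLD * maxDoc;
        return dense ? new DenseSet(docs, maxDoc) : new SparseSet(docs);
    }
}
```

The hook sees only the doc ids and a size hint, which mirrors the narrower protected method proposed in the issue: the null and isCacheable() handling can stay in the caller, and a subclass overrides nothing but the choice of representation.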