Hello, Scott. I've found a straightforward implementation: https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L512 and a more space-efficient one: https://github.com/apache/lucene/blob/d6dbe4374a5229b827613b85066f3a4da91d5f27/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L531 If you use those snippets, you need to take care of segmentation yourself. There is also a utility, https://github.com/apache/lucene/blob/main/lucene/join/src/java/org/apache/lucene/search/join/QueryBitSetProducer.java which produces a top-level bitset over all segments.
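
To illustrate what handling segmentation yourself can involve, here is a minimal sketch in plain Java (no Lucene dependency). It assumes your application holds one index-wide java.util.BitSet keyed by top-level doc ID and needs a per-segment view of it. The class name SegmentBitSets, the method sliceForSegment, and the segment sizes are hypothetical; in Lucene the corresponding values would come from LeafReaderContext.docBase and LeafReader.maxDoc().

```java
import java.util.BitSet;

public class SegmentBitSets {

    /**
     * Returns the bits that belong to one segment, re-based so that
     * bit 0 corresponds to the segment's first document.
     *
     * @param topLevel index-wide bitset keyed by top-level doc ID
     * @param docBase  top-level doc ID of the segment's first document
     * @param maxDoc   number of documents in the segment
     */
    public static BitSet sliceForSegment(BitSet topLevel, int docBase, int maxDoc) {
        // BitSet.get(from, to) copies the range [from, to) into a new
        // BitSet whose indices start at zero.
        return topLevel.get(docBase, docBase + maxDoc);
    }

    public static void main(String[] args) {
        // Index-wide filter: top-level docs 1, 4 and 6 are allowed.
        BitSet topLevel = new BitSet();
        topLevel.set(1);
        topLevel.set(4);
        topLevel.set(6);

        // Hypothetical index layout: two segments holding 3 and 5 documents.
        int[] docBases = {0, 3};
        int[] maxDocs = {3, 5};

        for (int i = 0; i < docBases.length; i++) {
            BitSet perSegment = sliceForSegment(topLevel, docBases[i], maxDocs[i]);
            System.out.println("segment " + i + ": " + perSegment);
        }
    }
}
```

Top-level doc 4 falls in the second segment and becomes segment-local doc 1 there (4 minus the docBase of 3). A custom Query would have to do the same translation for every leaf it scores.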

On Tue, Jul 12, 2022 at 12:44 AM Scotter <scottro...@gmail.com> wrote:

> Hi there,
>
> Hopefully this is the right audience for my question. I'm a developer working on an effort to upgrade our Java app from Lucene 5 to Lucene 8 (or later). While doing investigation into changes in these versions the main thing that I'm struggling with is how to replace our current usage of org.apache.lucene.search.Filter as we use this class pretty heavily, and this class was previously deprecated and has been removed after version 5.
>
> I've looked at the migration guide for Lucene 6 <https://lucene.apache.org/core/6_5_1/MIGRATE.html> and javadocs and I'm just not understanding the intended path to migrate away from using Filter and FilteredQuery. In the migration guide I see:
>
> Removal of Filter and FilteredQuery (LUCENE-6301 <https://issues.apache.org/jira/browse/LUCENE-6301>, LUCENE-6583 <https://issues.apache.org/jira/browse/LUCENE-6583>)
>
> Filter and FilteredQuery have been removed. Regular queries can be used instead of filters as they have been optimized for the filtering case. And you can construct a BooleanQuery with one MUST clause for the query, and one FILTER clause for the filter in order to have similar behaviour to FilteredQuery.
>
> It is my understanding that in older versions of Lucene filters were similar to queries, except they didn't participate in scoring, and I can see now how to generate a query that doesn't apply to scoring by using a BooleanQuery with the BooleanClause.Occur.FILTER option to effectively get the same behavior. So that makes sense.
>
> But another capability of the Filter class was the ability to provide a DocIdSet to indicate which documents should be permitted in search results.
>
> Filter.getDocIdSet():
> <https://lucene.apache.org/core/5_5_0/core/org/apache/lucene/search/Filter.html#getDocIdSet(org.apache.lucene.index.LeafReaderContext,%20org.apache.lucene.util.Bits)>
>
> public abstract DocIdSet getDocIdSet(LeafReaderContext context, Bits acceptDocs) throws IOException;
>
> We use this low-level capability to do filtering of documents based on BitSets we populate in application code and then convert to DocIdSets when running Lucene queries in certain contexts. We have an extension of the Filter class that does exactly this, and it's pretty straightforward. Now that Filter has been removed, is there a suggested Query implementation to use that would provide similar behavior? I've looked at the implementations of the Query class mentioned here:
> https://lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/Query.html
> but I'm not seeing any that accept a DocIdSet or BitSet, or would be relevant to the use case I described above. I've also looked at stack overflow and other forums online to get insight into this problem but to no avail.
>
> If there's no existing Query implementation relevant to this use case, would you suggest I write my own Query implementation similar to the old FilteredQuery? Or might there be a better way to go about solving this problem that scales better and is performant? We basically want to apply a BitSet filter to every Lucene query that a user performs in certain contexts. We have the ability to quickly and easily populate BitSet instances representing all of the Lucene Doc IDs in the index, with the bits turned on for those documents we want to be included in search results.
>
> If this has already been answered in a forum post, I apologize. Or if there's a Lucene specific forum somewhere I could look at, if you could kindly point me there, I would appreciate it.
>
> Any help/insight is greatly appreciated.
>
> Thanks,
> Scott Robey

--
Sincerely yours
Mikhail Khludnev