Hi,

Great, thanks a lot.
Pointing me to RandomAccessWeight and the approach used in DocValuesNumbersQuery is exactly what I needed for my use case.

I created my own query type that takes advantage of LongBitSet values I already have loaded. This lets me efficiently implement the Bits that match documents inside my own RandomAccessWeight implementation.

This approach is efficient when the number of values exceeds a certain threshold. Below that threshold, TermsQuery is more efficient. My code decides which approach to use by applying my own heuristic.

Overall, for larger value maps (above 20,000 entries), I decreased search time to about 10-30% of what it was before. For smaller value maps, search time stays efficient because TermsQuery is used.

Thanks again!
Josef

-----Original Message-----
From: Trejkaz [mailto:trej...@trypticon.org]
Sent: Wednesday, June 27, 2018 4:51 AM
To: Lucene Users Mailing List
Subject: Re: Efficient way to define large Boolean Occur.FILTER clause in Lucene 6

On Tue, Jun 26, 2018 at 7:02 PM, Hasenberger, Josef <josef.hasenber...@zetcom.com> wrote:
> However, I have a feeling that the conversion from Long values to Terms is
> rather inefficient for large collections and also uses a lot of memory.
> To ease conversion overhead somewhat, I created a class that converts a
> Long value directly to BytesRef instance (in order to avoid conversion to
> UTF16 and then UTF8 again) and pass that instance to the Term constructor.

First thought is, why are you using TermsQuery if they're in DocValues? Is DocValuesTermsQuery any better? It does depend on how many terms you're searching for.

Second thought is that there is also DocValuesNumbersQuery, which avoids having to convert all the values.

> I just wonder if there is a better method for passing large amount of filter
> criteria to a BooleanQuery Occur.FILTER clause, that avoids excessive object
> creation.
If you can get your long values into something which implements Bits, you could make a query using RandomAccessWeight to directly point at the existing set you already have in memory.

TX

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
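
For readers following the thread, here is a rough sketch of the two ideas discussed above: exposing an in-memory set through a Bits-style view, and switching strategy by the number of filter values. It is not the code from this thread, which was not posted. To stay runnable without Lucene on the classpath it makes several assumptions: the nested Bits interface is a local stand-in that mirrors the get(int) and length() methods of Lucene's Bits, java.util.BitSet stands in for LongBitSet, the long[] array stands in for one segment's per-document values, and the names LargeFilterSketch, chooseStrategy, matchingDocs and the 20,000 cutoff are hypothetical (the message reports gains above roughly 20,000 entries but does not state the actual heuristic).

```java
import java.util.BitSet;

final class LargeFilterSketch {

    /** Local stand-in mirroring the two methods of Lucene's Bits interface. */
    interface Bits {
        boolean get(int index);
        int length();
    }

    /** Which kind of query to build for a given filter. */
    enum Strategy { TERMS_QUERY, BITS_BACKED_QUERY }

    /** Assumed cutoff for illustration only; tune it against real measurements. */
    static final int THRESHOLD = 20_000;

    /** Small filters go to the terms-based query, large ones to the bit-set-backed query. */
    static Strategy chooseStrategy(int valueCount) {
        return valueCount > THRESHOLD ? Strategy.BITS_BACKED_QUERY : Strategy.TERMS_QUERY;
    }

    /**
     * Returns a view in which document i matches when its value is in the allowed set.
     * No per-value objects are created; the existing set is consulted directly.
     * Assumes allowed values are non-negative and fit in an int (a BitSet limitation).
     */
    static Bits matchingDocs(long[] valuePerDoc, BitSet allowed) {
        return new Bits() {
            @Override
            public boolean get(int doc) {
                long v = valuePerDoc[doc];
                return v >= 0 && v <= Integer.MAX_VALUE && allowed.get((int) v);
            }

            @Override
            public int length() {
                return valuePerDoc.length;
            }
        };
    }

    public static void main(String[] args) {
        BitSet allowed = new BitSet();
        allowed.set(7);
        allowed.set(42);
        long[] valuePerDoc = {7L, 8L, 42L, -1L};

        Bits bits = matchingDocs(valuePerDoc, allowed);
        for (int doc = 0; doc < bits.length(); doc++) {
            System.out.println("doc " + doc + " matches: " + bits.get(doc));
        }
        System.out.println("strategy for 50 values: " + chooseStrategy(50));
        System.out.println("strategy for 100000 values: " + chooseStrategy(100_000));
    }
}
```

In a real Lucene 6 query, a view like this would presumably be what the RandomAccessWeight implementation returns for each segment, with the per-document values read from doc values rather than from an array.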