Hi all. We have one kind of query where you essentially specify a text file containing the actual query to search for. The catch is that the text file can be large.
Our custom query currently computes the full set of matching docs up-front, and then, when the search reaches a given LeafReader, that larger doc ID set is sliced so that only the sub-slice for that leaf is returned. This is confusing, and it seems backwards.

As an alternative, we could override rewrite(IndexReader) and return a gigantic BooleanQuery. The problems with that are: 1) a gigantic BooleanQuery takes up a lot more memory than a list of query strings, and 2) Lucene devs often say that gigantic boolean queries are bad, maybe for reason #1, or maybe for some other reason which nobody quite understands.

So instead of this, is there some kind of alternative? For instance, is there a query type where I can provide an iterator of sub-queries, so that they don't all have to be in memory at once? The code to get each sub-query is always relatively straightforward and easy to understand. I guess the snag is that sometimes the line of text is natural language which gets run through an analyser, so we'd potentially be re-analysing the text once per leaf reader? :/

This would replace about 1/3 of the remaining places where we have to compute the doc ID set up-front.
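To make the per-leaf idea concrete, here is a rough sketch of the direction I mean (not something we actually run). It assumes a reasonably recent Lucene (the ScoreMode-era Weight API, roughly 8.x/9.x) and, for simplicity, that each line of the file is already a single indexed term in one field, so the analysis step is skipped; TermFileQuery, the field name and the file handling are all made up for illustration. The point is just that nothing is materialised up-front: the Weight re-reads the file inside scorer(LeafReaderContext) and ORs the matching postings for that leaf into a DocIdSetBuilder, which is also exactly where the "re-analysing once per leaf reader" cost would land if the lines did need an Analyzer.

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Objects;

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.PostingsEnum;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.ConstantScoreScorer;
import org.apache.lucene.search.ConstantScoreWeight;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryVisitor;
import org.apache.lucene.search.ScoreMode;
import org.apache.lucene.search.Scorer;
import org.apache.lucene.search.Weight;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.DocIdSetBuilder;

/** Hypothetical query: matches any doc containing at least one term listed in a text file. */
public final class TermFileQuery extends Query {
  private final String field;
  private final Path termFile;

  public TermFileQuery(String field, Path termFile) {
    this.field = field;
    this.termFile = termFile;
  }

  @Override
  public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float boost) {
    return new ConstantScoreWeight(this, boost) {
      @Override
      public Scorer scorer(LeafReaderContext context) throws IOException {
        Terms terms = context.reader().terms(field);
        if (terms == null) {
          return null; // field absent from this segment
        }
        TermsEnum termsEnum = terms.iterator();
        DocIdSetBuilder builder = new DocIdSetBuilder(context.reader().maxDoc(), terms);
        PostingsEnum postings = null;
        // Stream the file once per leaf instead of materialising every sub-query up-front.
        try (BufferedReader lines = Files.newBufferedReader(termFile, StandardCharsets.UTF_8)) {
          String line;
          while ((line = lines.readLine()) != null) {
            if (!line.isEmpty() && termsEnum.seekExact(new BytesRef(line))) {
              postings = termsEnum.postings(postings, PostingsEnum.NONE);
              builder.add(postings); // OR this term's docs into the per-leaf set
            }
          }
        }
        DocIdSetIterator it = builder.build().iterator();
        return it == null ? null : new ConstantScoreScorer(this, score(), scoreMode, it);
      }

      @Override
      public boolean isCacheable(LeafReaderContext ctx) {
        return false; // result depends on an external file, so keep it out of the query cache
      }
    };
  }

  @Override
  public void visit(QueryVisitor visitor) {
    visitor.visitLeaf(this);
  }

  @Override
  public String toString(String f) {
    return "TermFileQuery(field=" + field + ", file=" + termFile + ")";
  }

  @Override
  public boolean equals(Object other) {
    return sameClassAs(other)
        && field.equals(((TermFileQuery) other).field)
        && termFile.equals(((TermFileQuery) other).termFile);
  }

  @Override
  public int hashCode() {
    return Objects.hash(classHash(), field, termFile);
  }
}

The obvious trade-off is I/O: the file gets read once per segment rather than once per query, which mirrors the re-analysis concern above. Caching the parsed lines (or the analysed terms) on the query instance would avoid that, but then we're back to holding everything in memory, which is more or less the BooleanQuery problem again.

TX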