rmuir commented on code in PR #12055: URL: https://github.com/apache/lucene/pull/12055#discussion_r1059804843
##########
lucene/core/src/java/org/apache/lucene/search/MultiTermQueryConstantScoreWrapper.java:
##########
@@ -183,23 +182,31 @@ private WeightOrDocIdSet rewrite(LeafReaderContext context) throws IOException {
       }
       Query q = new ConstantScoreQuery(bq.build());
       final Weight weight = searcher.rewrite(q).createWeight(searcher, scoreMode, score());
-      return new WeightOrDocIdSet(weight);
+      return new WeightOrDocIdSetIterator(weight);
     }

     // Too many terms: go back to the terms we already collected and start building the bit set
-    DocIdSetBuilder builder = new DocIdSetBuilder(context.reader().maxDoc(), terms);
+    PriorityQueue<PostingsEnum> highFrequencyTerms =
+        new PriorityQueue<PostingsEnum>(collectedTerms.size()) {
+          @Override
+          protected boolean lessThan(PostingsEnum a, PostingsEnum b) {
+            return a.cost() < b.cost();

Review Comment:
   `pq.insertWithOverflow` uses `!lessThan()` in its code, so I'm worried about this PQ behaving stupidly on ties with the same `docFreq`. Is there a simple tiebreaker we can use (even a synthetic one such as an `int termId`) so that such ties don't enter the PQ?

   I'm mainly concerned about the "collect remaining terms" piece for cases where there are jazillions of terms. A tiebreaker should also allow the IO to be a bit more sequential in such cases, rather than constantly replacing the top of the PQ with more ties.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
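[Editorial note] The tiebreaker idea in the review comment can be sketched in isolation. The snippet below is a minimal illustration, not Lucene code: `CostedTerm` is a hypothetical stand-in for a `PostingsEnum` with a synthetic term id attached, and it uses `java.util.PriorityQueue` with a `Comparator` rather than Lucene's `lessThan()`-based `PriorityQueue`. The point is the same: ordering by cost alone leaves equal-cost entries unordered, while a secondary key (a synthetic `termId` assigned in collection order) gives a total order, so ties are resolved deterministically instead of churning the top of the queue.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical stand-in for a PostingsEnum plus a synthetic term id
// (assigned in the order terms were collected). Illustrative only.
record CostedTerm(long cost, int termId) {}

public class TiebreakerSketch {
    public static void main(String[] args) {
        // Primary key: cost (analogous to PostingsEnum.cost()/docFreq).
        // Secondary key: termId, so equal-cost entries still have a
        // deterministic total order and never "tie" inside the queue.
        Comparator<CostedTerm> byCostThenTermId =
            Comparator.comparingLong(CostedTerm::cost)
                      .thenComparingInt(CostedTerm::termId);

        PriorityQueue<CostedTerm> pq = new PriorityQueue<>(byCostThenTermId);
        pq.add(new CostedTerm(5, 2));
        pq.add(new CostedTerm(5, 0)); // same cost as above, lower termId
        pq.add(new CostedTerm(3, 1));

        // Lowest cost first; among equal costs, lower termId first.
        System.out.println(pq.poll().termId()); // 1 (cost 3)
        System.out.println(pq.poll().termId()); // 0 (cost 5, tie broken by termId)
        System.out.println(pq.poll().termId()); // 2 (cost 5)
    }
}
```

Using collection order as the synthetic id also matches the sequential-IO point in the comment: terms drained in tie-broken order tend to follow the order they were read from the terms dictionary.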