[ https://issues.apache.org/jira/browse/LUCENE-7958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795905#comment-16795905 ]
Adrien Grand commented on LUCENE-7958:
--------------------------------------

Thanks for sharing [~hermes]. I should resurrect the above patch when I have some time!

> Give TermInSetQuery better advancing capabilities
> -------------------------------------------------
>
>                 Key: LUCENE-7958
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7958
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7958.patch
>
>
> If a TermInSetQuery has more than 15 matching terms on a given segment, then
> we consume all postings lists into a bitset and return an iterator over this
> bitset as a scorer. I would like to change it so that we keep the 15 postings
> lists that have the largest document frequencies and consume all other
> (shorter) postings lists into a bitset. In the end we return a disjunction
> over the N longest postings lists and the bit set. This could help consume
> fewer doc ids if the TermInSetQuery is intersected with other queries,
> especially if the document frequencies of the terms it wraps have a zipfian
> distribution.
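
For readers following the issue description, the proposed split can be sketched roughly as below. This is only an illustration in plain Java, not the attached patch and not Lucene's API: the class `HybridTermSet`, the methods `split` and `advance`, and the `keep` parameter are hypothetical names, and each postings list is modeled as a sorted `int[]` of doc ids whose length stands in for the document frequency.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Lucene.
class HybridTermSet {
    /** The longest postings lists kept as-is, plus a bitset with the union of the shorter ones. */
    static final class Split {
        final List<int[]> longest = new ArrayList<>();
        final BitSet rest = new BitSet();
    }

    /** Keeps the `keep` lists with the largest document frequency and ORs all others into a bitset. */
    static Split split(List<int[]> postings, int keep) {
        List<int[]> sorted = new ArrayList<>(postings);
        // Assumption: document frequency is modeled as the array length.
        sorted.sort(Comparator.comparingInt((int[] p) -> p.length).reversed());
        Split s = new Split();
        for (int i = 0; i < sorted.size(); i++) {
            if (i < keep) {
                s.longest.add(sorted.get(i));
            } else {
                for (int doc : sorted.get(i)) {
                    s.rest.set(doc);
                }
            }
        }
        return s;
    }

    /** Disjunction: smallest doc id >= target in the bitset or any kept list, or -1 if none. */
    static int advance(Split s, int target) {
        int best = s.rest.nextSetBit(target); // -1 when no bit is set at or after target
        for (int[] p : s.longest) {
            int idx = Arrays.binarySearch(p, target);
            if (idx < 0) {
                idx = -idx - 1; // insertion point: first element greater than target
            }
            if (idx < p.length && (best == -1 || p[idx] < best)) {
                best = p[idx];
            }
        }
        return best;
    }
}
```

In this sketch only the short lists are read in full up front; the long lists are touched only when `advance` is called, which is where a conjunction with a selective query could skip most of their doc ids.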