[ 
https://issues.apache.org/jira/browse/LUCENE-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582337#comment-13582337
 ] 

Robert Muir commented on LUCENE-4571:
-------------------------------------

Stefan, this looks very promising! I think we should add support for this query 
to luceneutil and try it out.

About the docFreq idea: a scorer/DISI cost-estimation patch already exists in at 
least two places. For example, TermScorer returns its docFreq, and disjunctions 
return the sum over their subscorers. This actually speeds up conjunctions in 
general and removes the need for the specialized ConjunctionTermScorer. I think 
it would be useful here too. I'll find the link and add it in a comment in a bit.
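
A minimal sketch of the cost-estimation idea described above, not the actual 
patch: CostedScorer, TermCost, DisjunctionCost and orderForConjunction are 
hypothetical names invented for illustration and are not Lucene APIs. A term 
reports its docFreq, a disjunction reports the sum over its subscorers, and a 
conjunction could use those estimates to put the cheapest clause first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CostSketch {
    /** Hypothetical stand-in for a scorer that can estimate its own cost. */
    interface CostedScorer {
        /** Upper bound on the number of documents this scorer may visit. */
        long cost();
    }

    /** A term clause: its cost is simply the term's document frequency. */
    static final class TermCost implements CostedScorer {
        private final long docFreq;
        TermCost(long docFreq) { this.docFreq = docFreq; }
        @Override public long cost() { return docFreq; }
    }

    /** A disjunction: its cost is the sum over its subscorers. */
    static final class DisjunctionCost implements CostedScorer {
        private final List<CostedScorer> subs;
        DisjunctionCost(List<CostedScorer> subs) { this.subs = subs; }
        @Override public long cost() {
            long sum = 0;
            for (CostedScorer s : subs) {
                sum += s.cost();
            }
            return sum;
        }
    }

    /** Returns a copy of the clauses sorted cheapest first, so the sparsest one can lead. */
    static List<CostedScorer> orderForConjunction(List<CostedScorer> clauses) {
        List<CostedScorer> sorted = new ArrayList<>(clauses);
        sorted.sort(Comparator.comparingLong(CostedScorer::cost));
        return sorted;
    }
}
```

With estimates like these, a generic conjunction can pick its lead clause 
without knowing whether the clause is a term or something more complex, which 
is presumably why a term-only specialization would no longer be needed.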
                
> speedup disjunction with minShouldMatch 
> ----------------------------------------
>
>                 Key: LUCENE-4571
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4571
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>    Affects Versions: 4.1
>            Reporter: Mikhail Khludnev
>         Attachments: LUCENE-4571.patch
>
>
> Even when minShouldMatch is supplied to DisjunctionSumScorer, it enumerates the 
> whole disjunction and verifies the minShouldMatch condition [on every 
> doc|https://github.com/apache/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/DisjunctionSumScorer.java#L70]:
> {code}
>   public int nextDoc() throws IOException {
>     assert doc != NO_MORE_DOCS;
>     while(true) {
>       while (subScorers[0].docID() == doc) {
>         if (subScorers[0].nextDoc() != NO_MORE_DOCS) {
>           heapAdjust(0);
>         } else {
>           heapRemoveRoot();
>           if (numScorers < minimumNrMatchers) {
>             return doc = NO_MORE_DOCS;
>           }
>         }
>       }
>       afterNext();
>       if (nrMatchers >= minimumNrMatchers) {
>         break;
>       }
>     }
>     
>     return doc;
>   }
> {code}
> [~spo] proposes (as far as I understand it) to pop nrMatchers-1 scorers from the 
> heap first, and then push them back after advancing them past that top doc. For 
> me, question no. 1 is: is there a performance test for minShouldMatch-constrained 
> disjunctions?
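
To make the baseline behaviour concrete, here is a small self-contained sketch 
of the enumerate-and-count approach the quoted nextDoc() takes. It is not the 
Lucene code and not the proposed optimization: MinShouldMatchSketch, Cursor and 
matches are hypothetical names, and plain sorted, duplicate-free int arrays 
stand in for the sub-scorers. Every doc in the union is visited and its 
matchers counted, and enumeration stops only once fewer lists remain than 
minShouldMatch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class MinShouldMatchSketch {
    /** Cursor over one sorted posting list (hypothetical stand-in for a sub-scorer). */
    private static final class Cursor {
        final int[] docs;
        int pos = 0;
        Cursor(int[] docs) { this.docs = docs; }
        int doc() { return docs[pos]; }
        /** Moves to the next doc; returns false when the list is exhausted. */
        boolean advance() { return ++pos < docs.length; }
    }

    /** Returns the docs that occur in at least minShouldMatch of the posting lists. */
    static List<Integer> matches(int[][] postings, int minShouldMatch) {
        // Min-heap ordered by each cursor's current doc id.
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a.doc(), b.doc()));
        for (int[] p : postings) {
            if (p.length > 0) {
                heap.add(new Cursor(p));
            }
        }
        List<Integer> result = new ArrayList<>();
        // Once fewer lists remain than minShouldMatch, no further doc can qualify.
        while (!heap.isEmpty() && heap.size() >= minShouldMatch) {
            int doc = heap.peek().doc();
            int nrMatchers = 0;
            // Count every list positioned on this doc and move each one forward.
            while (!heap.isEmpty() && heap.peek().doc() == doc) {
                Cursor top = heap.poll();
                nrMatchers++;
                if (top.advance()) {
                    heap.add(top);
                }
            }
            // The constraint is only checked after the doc was fully enumerated.
            if (nrMatchers >= minShouldMatch) {
                result.add(doc);
            }
        }
        return result;
    }
}
```

For example, with the lists {1,3,5,7}, {3,4,7} and {2,3,7,9} and 
minShouldMatch = 2, the sketch touches docs 1, 2, 3, 4, 5 and 7 but returns 
only 3 and 7. The docs visited and then rejected are the work an optimized 
scorer would try to skip.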
