[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses

Michael McCandless (JIRA) Mon, 18 Aug 2014 05:47:28 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100596#comment-14100596 ]


Michael McCandless commented on LUCENE-4396:
--------------------------------------------

And.tasks perf:

{noformat}
Report after iter 19:
                    Task    QPS base      StdDev    QPS comp      StdDev                Pct diff
          HighAnd5LowNot       15.44      (5.9%)       14.86      (1.8%)   -3.8% ( -10% -    4%)
         HighAnd5HighNot        2.95      (2.0%)        2.93      (1.1%)   -0.8% (  -3% -    2%)
         HighAnd60LowNot        2.18      (1.9%)        2.17      (1.5%)   -0.5% (  -3% -    2%)
          LowAnd5HighNot       11.21      (2.0%)       11.17      (1.8%)   -0.3% (  -4% -    3%)
           LowAnd5HighOr       16.29      (2.5%)       16.27      (2.1%)   -0.1% (  -4% -    4%)
           LowAnd5LowNot       46.98      (3.1%)       47.02      (2.3%)    0.1% (  -5% -    5%)
          HighAnd60LowOr        2.00      (2.1%)        2.01      (1.6%)    0.3% (  -3% -    3%)
          HighAnd5HighOr        2.71      (2.9%)        2.73      (1.4%)    1.0% (  -3% -    5%)
           HighAnd5LowOr       16.17     (10.5%)       16.49      (2.6%)    2.0% ( -10% -   16%)
            LowAnd5LowOr       45.06      (4.2%)       46.11      (3.2%)    2.3% (  -4% -   10%)
         LowAnd60HighNot        1.03      (1.8%)        1.59      (3.4%)   53.4% (  47% -   59%)
          LowAnd60HighOr        0.87      (1.7%)        1.44      (3.7%)   64.7% (  58% -   71%)
          LowAnd60LowNot        3.55      (2.1%)        6.68      (3.3%)   88.2% (  81% -   95%)
           LowAnd60LowOr        3.66      (1.6%)        7.05      (3.6%)   92.5% (  85% -   99%)
        HighAnd60HighNot        0.19      (2.1%)        0.39      (3.8%)  106.4% (  98% -  114%)
         HighAnd60HighOr        0.15      (1.6%)        0.34      (3.2%)  119.0% ( 112% -  125%)
{noformat}
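
For readers checking the numbers: the Pct diff column appears to be the relative change of the comp QPS over the base QPS. A minimal sketch of that arithmetic, assuming this definition (the class and method names are hypothetical, not part of the benchmark tooling). Because the report prints rounded QPS values, recomputing from the table will not reproduce every printed percentage exactly.

```java
public class PctDiff {
    /** Relative change of comp over base, in percent (assumed definition). */
    public static double pctDiff(double qpsBase, double qpsComp) {
        return 100.0 * (qpsComp - qpsBase) / qpsBase;
    }

    public static void main(String[] args) {
        // HighAnd5LowNot row: base 15.44 QPS, comp 14.86 QPS
        System.out.printf("HighAnd5LowNot: %.1f%%%n", pctDiff(15.44, 14.86));
        // LowAnd60LowOr row: base 3.66 QPS, comp 7.05 QPS
        System.out.printf("LowAnd60LowOr:  %.1f%%%n", pctDiff(3.66, 7.05));
    }
}
```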

> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
>                 Key: LUCENE-4396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: And.tasks, And.tasks, AndOr.tasks, AndOr.tasks, 
> LUCENE-4396-simple.patch, LUCENE-4396-simple.patch, LUCENE-4396-simple.patch, 
> LUCENE-4396-simple.patch, LUCENE-4396.patch, LUCENE-4396.patch, 
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, 
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, 
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, 
> LUCENE-4396.patch, LUCENE-4396.patch, SIZE.perf, all.perf, 
> luceneutil-score-equal.patch, luceneutil-score-equal.patch, 
> merge-simple.perf, merge-simple.png, merge.perf, merge.png, perf.png, 
> stat.cpp, stat.cpp, tasks.cpp
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT
> clauses. If there are one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have a very low hit count compared
> to the other clauses, BooleanScorer would perform better than
> BooleanScorer2.  BooleanScorer still has some vestiges from when it used to
> handle MUST, so it shouldn't be hard to bring back this capability ... I think
> the challenging part might be the heuristics on when to use which (likely we
> would have to use firstDocID as a proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs
> in this case, e.g. if the MUST clause suddenly skips 1000000 docs then you
> want to .advance() all the SHOULD clauses.
> I won't have near-term time to work on this, so feel free to take it if you
> are inspired!
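
The .advance() idea in the description can be illustrated with a toy example. This is only a sketch under stated assumptions: `ArrayDocs` is a hypothetical stand-in for a doc ID iterator (it is not Lucene's class), and the loop shows just the "catch the SHOULD clauses up when the MUST clause jumps" step, not BooleanScorer's actual bucketed implementation or any patch attached to this issue.

```java
import java.util.ArrayList;
import java.util.List;

public class MustDrivenSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Toy iterator over a sorted array of doc IDs (hypothetical helper). */
    public static final class ArrayDocs {
        private final int[] docs;
        private int pos = -1;

        public ArrayDocs(int... docs) { this.docs = docs; }

        int docID() {
            if (pos < 0) return -1;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        int nextDoc() { pos++; return docID(); }

        /** Move to the first doc >= target. A real index would use skip data here. */
        int advance(int target) {
            if (pos < 0) pos = 0;
            while (pos < docs.length && docs[pos] < target) pos++;
            return docID();
        }
    }

    /** Returns "doc:matchingShouldCount" for every doc matched by the MUST clause. */
    public static List<String> score(ArrayDocs must, ArrayDocs... shoulds) {
        List<String> hits = new ArrayList<>();
        for (int doc = must.nextDoc(); doc != NO_MORE_DOCS; doc = must.nextDoc()) {
            int matched = 0;
            for (ArrayDocs should : shoulds) {
                // If the MUST clause jumped ahead, skip the SHOULD clause forward
                // instead of stepping through every doc in between.
                if (should.docID() < doc) should.advance(doc);
                if (should.docID() == doc) matched++;
            }
            hits.add(doc + ":" + matched);
        }
        return hits;
    }

    public static void main(String[] args) {
        ArrayDocs must = new ArrayDocs(3, 1000003);
        ArrayDocs should1 = new ArrayDocs(1, 2, 3, 500, 1000003);
        ArrayDocs should2 = new ArrayDocs(3, 7, 9);
        System.out.println(score(must, should1, should2));
    }
}
```

In this toy the MUST clause always drives iteration; the heuristic question raised in the description (when driving from the MUST clause beats scoring all clauses in bulk) is deliberately left out.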



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
