[jira] [Updated] (LUCENE-6940) Bulk scoring could speed up MUST_NOT clauses

Adrien Grand (JIRA) Wed, 23 Dec 2015 13:58:58 -0800

     [ https://issues.apache.org/jira/browse/LUCENE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Adrien Grand updated LUCENE-6940:
---------------------------------
    Attachment: LUCENE-6940.patch

Here is a new patch. This time it includes tests and organizes the code a 
bit better. Tests pass, and luceneutil still reports similar numbers (this time 
I only ran the default tasks for wikimedium10m):

{noformat}
                    Task  QPS baseline      StdDev   QPS patch      StdDev                Pct diff
           OrNotHighHigh         35.34      (2.2%)       32.99      (3.3%)   -6.7% ( -11% -   -1%)
            OrNotHighMed        174.57      (2.3%)      170.17      (1.8%)   -2.5% (  -6% -    1%)
               OrHighMed         73.34      (4.6%)       72.13      (5.0%)   -1.7% ( -10% -    8%)
              OrHighHigh          9.17      (5.7%)        9.03      (5.2%)   -1.5% ( -11% -    9%)
           OrHighNotHigh         78.91      (2.8%)       78.16      (3.6%)   -0.9% (  -7% -    5%)
               OrHighLow         19.79      (2.7%)       19.73      (3.6%)   -0.3% (  -6% -    6%)
         LowSloppyPhrase         70.26      (2.1%)       70.14      (2.3%)   -0.2% (  -4% -    4%)
              AndHighMed        191.70      (1.7%)      191.59      (1.7%)   -0.1% (  -3% -    3%)
             AndHighHigh         79.74      (1.0%)       79.79      (0.9%)    0.1% (  -1% -    1%)
         MedSloppyPhrase         73.28      (2.5%)       73.37      (2.7%)    0.1% (  -5% -    5%)
                 Respell         84.33      (2.4%)       84.60      (2.8%)    0.3% (  -4% -    5%)
             LowSpanNear         12.79      (3.9%)       12.83      (3.0%)    0.4% (  -6% -    7%)
               LowPhrase         48.59      (1.2%)       48.79      (1.2%)    0.4% (  -2% -    2%)
               MedPhrase         33.55      (1.4%)       33.71      (1.3%)    0.5% (  -2% -    3%)
            HighSpanNear         14.50      (3.2%)       14.60      (2.6%)    0.7% (  -4% -    6%)
             MedSpanNear        151.02      (3.3%)      152.17      (1.7%)    0.8% (  -4% -    5%)
        HighSloppyPhrase         15.76      (5.3%)       15.90      (5.2%)    0.9% (  -9% -   12%)
              HighPhrase         32.51      (2.3%)       33.09      (1.4%)    1.8% (  -1% -    5%)
                 Prefix3         90.59      (8.9%)       92.74      (7.5%)    2.4% ( -12% -   20%)
                Wildcard        125.13      (8.2%)      128.21      (7.8%)    2.5% ( -12% -   20%)
                 MedTerm        291.05      (6.8%)      300.34      (6.5%)    3.2% (  -9% -   17%)
                  Fuzzy1         61.93      (8.8%)       64.08      (9.6%)    3.5% ( -13% -   23%)
                HighTerm         79.63      (7.3%)       83.28      (6.9%)    4.6% (  -8% -   20%)
                  IntNRQ         10.39     (13.8%)       10.94     (11.5%)    5.3% ( -17% -   35%)
                 LowTerm        575.82     (12.7%)      607.32     (10.6%)    5.5% ( -15% -   33%)
            OrNotHighLow        985.95      (4.4%)     1054.73      (3.0%)    7.0% (   0% -   15%)
              AndHighLow        688.12      (8.2%)      736.65      (4.5%)    7.1% (  -5% -   21%)
                  Fuzzy2         58.94     (14.4%)       63.15      (8.9%)    7.2% ( -14% -   35%)
            OrHighNotMed         84.50      (3.4%)       95.46      (3.8%)   13.0% (   5% -   20%)
            OrHighNotLow         64.23      (3.3%)       76.36      (4.7%)   18.9% (  10% -   27%)
{noformat}

> Bulk scoring could speed up MUST_NOT clauses
> --------------------------------------------
>
>                 Key: LUCENE-6940
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6940
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6940.patch, LUCENE-6940.patch
>
>
> Today when you have MUST_NOT clauses, the ReqExclScorer is used and needs to 
> check the excluded clauses on every iteration. I suspect we could speed 
> things up by having a BulkScorer that would advance the excluded clause first 
> and then tell the required clause to bulk score up to the next excluded 
> document.
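
To make the idea in the description concrete, here is a minimal sketch of
"advance the excluded clause first, then bulk score the required clause up to
the next excluded document". It is not the attached patch and does not use
Lucene's classes: ReqExclBulkSketch, SortedDocs, scoreRange and score are
hypothetical names, doc IDs are plain sorted int arrays, and "scoring" just
collects the matching doc IDs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these types exist in Lucene.
final class ReqExclBulkSketch {
    // Sentinel meaning "iterator exhausted" (assumption: real doc IDs are smaller).
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Stand-in for a doc-ID iterator over a sorted int[]. */
    static final class SortedDocs {
        private final int[] docs;
        private int pos = 0;

        SortedDocs(int[] docs) {
            this.docs = docs;
        }

        /** Returns the first doc >= target, or NO_MORE_DOCS if there is none. */
        int advance(int target) {
            while (pos < docs.length && docs[pos] < target) {
                pos++;
            }
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }
    }

    /** "Bulk scores" the required clause over [min, max) with no exclusion checks. */
    static void scoreRange(SortedDocs req, int min, int max, List<Integer> out) {
        for (int doc = req.advance(min); doc < max; doc = req.advance(doc + 1)) {
            out.add(doc);
        }
    }

    /** Returns the required docs that are not excluded, window by window. */
    static List<Integer> score(int[] required, int[] excluded) {
        SortedDocs req = new SortedDocs(required);
        SortedDocs excl = new SortedDocs(excluded);
        List<Integer> hits = new ArrayList<>();
        int upTo = 0;
        while (upTo != NO_MORE_DOCS) {
            // Advance the excluded clause first to find the end of the window.
            int nextExcluded = excl.advance(upTo);
            // Everything in [upTo, nextExcluded) is safe to score in bulk.
            scoreRange(req, upTo, nextExcluded, hits);
            if (nextExcluded == NO_MORE_DOCS) {
                break;
            }
            // Skip the excluded doc itself and start the next window.
            upTo = nextExcluded + 1;
        }
        return hits;
    }
}
```

In this sketch the excluded clause is consulted once per window rather than
once per required document, so the saving should be largest when excluded
documents are sparse. That would be consistent with OrHighNotLow showing the
biggest gain in the table above, although the sketch itself measures nothing.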



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

