[jira] [Commented] (LUCENE-9071) Speed up computation of BM25 scores

Adrien Grand (Jira) Thu, 28 Nov 2019 01:35:51 -0800


    [ https://issues.apache.org/jira/browse/LUCENE-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984267#comment-16984267 ]


Adrien Grand commented on LUCENE-9071:
--------------------------------------

I'm getting a small but consistently reproducible speedup for boolean/term 
queries, so I believe it is not noise. Here is the output of one run on 
wikibigall for instance:

{noformat}
                    Task  QPS baseline    StdDev   QPS patch    StdDev   Pct diff
                SpanNear          1.37   (20.9%)        1.36   (20.7%)      -0.5% ( -34% -   51%)
                PKLookup        195.89    (3.5%)      195.36    (3.8%)      -0.3% (  -7% -    7%)
                 Prefix3         59.17    (6.7%)       59.13    (6.6%)      -0.1% ( -12% -   14%)
   HighTermDayOfYearSort         46.46    (5.8%)       46.44    (6.1%)      -0.1% ( -11% -   12%)
       HighTermMonthSort         65.80   (12.4%)       65.80   (12.3%)      -0.0% ( -22% -   28%)
                  Fuzzy1        160.52   (12.7%)      160.58   (13.2%)       0.0% ( -22% -   29%)
        IntervalsOrdered         10.76    (3.3%)       10.76    (3.2%)       0.1% (  -6% -    6%)
                Wildcard        101.27    (3.7%)      101.33    (4.2%)       0.1% (  -7% -    8%)
            SloppyPhrase          6.32    (7.2%)        6.33    (7.2%)       0.1% ( -13% -   15%)
                  Fuzzy2         80.13    (8.3%)       80.42    (9.2%)       0.4% ( -15% -   19%)
         AndHighOrMedMed         37.69    (2.2%)       37.90    (1.9%)       0.6% (  -3% -    4%)
                  Phrase         10.95    (2.5%)       11.03    (2.4%)       0.8% (  -4% -    5%)
        AndMedOrHighHigh         28.77    (2.7%)       29.14    (3.1%)       1.3% (  -4% -    7%)
                  IntNRQ         92.24    (3.1%)       94.13    (3.4%)       2.1% (  -4% -    8%)
              AndHighMed         51.84    (3.0%)       52.92    (3.3%)       2.1% (  -4% -    8%)
                    Term       1375.94    (2.3%)     1405.17    (3.1%)       2.1% (  -3% -    7%)
               OrHighMed         71.51    (2.3%)       73.32    (2.9%)       2.5% (  -2% -    7%)
              OrHighHigh         86.33    (2.2%)       89.26    (3.0%)       3.4% (  -1% -    8%)
             AndHighHigh         36.11    (2.9%)       37.34    (3.9%)       3.4% (  -3% -   10%)
{noformat}
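
For readers checking the numbers: the Pct diff column appears to be consistent with the relative change in mean QPS between the two runs. The sketch below is only an illustration of that arithmetic; the class and method names are hypothetical and it is not the benchmark tool's own code. How the range in parentheses is derived is not stated in this comment, so it is not reproduced here.

```java
public class PctDiff {
    // Relative change of the patched QPS versus the baseline QPS, in percent.
    // Hypothetical helper, not part of any benchmark tool.
    static double pctDiff(double baselineQps, double patchQps) {
        return (patchQps - baselineQps) / baselineQps * 100.0;
    }

    public static void main(String[] args) {
        // QPS values copied from the Term and OrHighHigh rows of the table.
        System.out.println("Term: " + pctDiff(1375.94, 1405.17));
        System.out.println("OrHighHigh: " + pctDiff(86.33, 89.26));
    }
}
```

Rounded to one decimal place, these come out to 2.1 and 3.4, matching the reported values for those two rows.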

> Speed up computation of BM25 scores
> -----------------------------------
>
>                 Key: LUCENE-9071
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9071
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We changed the way BM25 scores are computed in LUCENE-7997 in order to 
> guarantee monotonicity of scores, but this caused a small decrease in 
> throughput; see annotation CC (October 2017) on Mike's nightly benchmarks. 
> Even though the total number of score computations has decreased since we 
> introduced block-max WAND, their relative cost is not negligible: we compute 
> scores not only on collected documents, but also when decoding skip lists, 
> in order to compute the maximum score per block or group of blocks.
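
To make the description above more concrete, here is a minimal sketch of a BM25 term score written two ways: the textbook form, and an algebraically equivalent form built only from steps that are each monotonic under float rounding. This is an illustration under stated assumptions, not the patch attached to this issue: the class, the method names, and the parameter values k1 = 1.2 and b = 0.75 are assumptions, and the issue text does not say which formula the change actually uses. The `maxScore` helper illustrates why scores are also computed for each block: a block's upper bound is the maximum score over its (freq, norm) pairs.

```java
public class Bm25Sketch {
    // Assumed, commonly used BM25 parameters; not taken from the issue.
    static final float K1 = 1.2f;
    static final float B = 0.75f;

    // Length normalization: k1 * (1 - b + b * docLen / avgDocLen).
    static float norm(float docLen, float avgDocLen) {
        return K1 * (1 - B + B * docLen / avgDocLen);
    }

    // Textbook form: weight * freq / (freq + norm).
    static float scoreDirect(float weight, float freq, float norm) {
        return weight * freq / (freq + norm);
    }

    // Equivalent form: weight - weight / (1 + freq * (1 / norm)).
    // Each step (multiply, add, divide, subtract) rounds monotonically,
    // so the result never decreases as freq increases.
    static float scoreRewritten(float weight, float freq, float normInverse) {
        return weight - weight / (1f + freq * normInverse);
    }

    // Hypothetical helper: upper bound for a block, taken as the maximum
    // score over the (freq, 1/norm) pairs recorded for that block.
    static float maxScore(float weight, float[] freqs, float[] normInverses) {
        float max = 0f;
        for (int i = 0; i < freqs.length; i++) {
            max = Math.max(max, scoreRewritten(weight, freqs[i], normInverses[i]));
        }
        return max;
    }

    public static void main(String[] args) {
        float weight = 2f;               // stands in for idf * boost
        float n = norm(120f, 100f);      // document slightly longer than average
        float inv = 1f / n;              // would be precomputed per norm value
        for (int freq = 1; freq <= 5; freq++) {
            System.out.println(freq + " " + scoreDirect(weight, freq, n)
                    + " " + scoreRewritten(weight, freq, inv));
        }
    }
}
```

The two forms agree up to float rounding. The rewritten one needs a single division per score once `1 / norm` has been precomputed, which is the kind of saving that matters when scores are evaluated both for collected documents and for every block's maximum.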



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
