[jira] [Created] (LUCENE-8432) Stop calling comparator even if early termination is not possible

Nikolay Khitrin (JIRA) Fri, 27 Jul 2018 04:10:18 -0700

Nikolay Khitrin created LUCENE-8432:
---------------------------------------


             Summary: Stop calling comparator even if early termination is not possible
                 Key: LUCENE-8432
                 URL: https://issues.apache.org/jira/browse/LUCENE-8432
             Project: Lucene - Core
          Issue Type: Improvement
          Components: core/search
    Affects Versions: 7.3
            Reporter: Nikolay Khitrin


TopFieldCollector keeps calling comparator.compareBottom when trackMaxScore or 
trackTotalHits is set, even if the result is known in advance due to document 
order.

The comparator call is not cheap because it can involve a doc values (DV) read 
from disk. Once the first non-competitive document in a segment is reached, 
all remaining comparator calls for that segment can be avoided.
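
To illustrate the idea, here is a minimal standalone sketch. It is not the 
patch itself and does not use any Lucene classes: the class 
SkipComparatorSketch, its collect method, and the skipAfterFirstMiss flag are 
hypothetical names made up for this example. It assumes every segment is 
already sorted by the search sort, models documents as plain int sort values, 
and counts a simulated compareBottom call each time a value is compared with 
the current bottom of the queue.

```java
import java.util.Collections;
import java.util.PriorityQueue;

public class SkipComparatorSketch {

    /** Total hit count plus the number of simulated compareBottom calls. */
    public static final class Stats {
        public final int totalHits;
        public final int comparatorCalls;

        Stats(int totalHits, int comparatorCalls) {
            this.totalHits = totalHits;
            this.comparatorCalls = comparatorCalls;
        }
    }

    /**
     * Collects the topN smallest values. Assumption: each inner array is one
     * segment whose values are already in ascending (index sort) order.
     */
    public static Stats collect(int[][] segments, int topN, boolean skipAfterFirstMiss) {
        // Max-heap: the head is the current "bottom" (worst entry) of the top N.
        PriorityQueue<Integer> queue = new PriorityQueue<>(Collections.reverseOrder());
        int totalHits = 0;
        int comparatorCalls = 0;

        for (int[] segment : segments) {
            boolean restNonCompetitive = false; // reset for every segment
            for (int value : segment) {
                totalHits++; // every hit is still counted, as with trackTotalHits
                if (queue.size() < topN) {
                    queue.add(value); // queue not full yet, nothing to compare
                    continue;
                }
                if (restNonCompetitive) {
                    continue; // result known from document order, skip the compare
                }
                comparatorCalls++; // stands in for comparator.compareBottom(doc)
                if (value >= queue.peek()) {
                    // Not competitive. In a sorted segment, no later doc can be either.
                    if (skipAfterFirstMiss) {
                        restNonCompetitive = true;
                    }
                    continue;
                }
                queue.poll(); // competitive: replace the bottom entry
                queue.add(value);
            }
        }
        return new Stats(totalHits, comparatorCalls);
    }

    public static void main(String[] args) {
        int[][] segments = { {1, 3, 5, 7, 9}, {2, 4, 6, 8, 10} };
        Stats baseline = collect(segments, 2, false);
        Stats patched = collect(segments, 2, true);
        System.out.println("baseline: hits=" + baseline.totalHits
                + " compares=" + baseline.comparatorCalls);
        System.out.println("patched:  hits=" + patched.totalHits
                + " compares=" + patched.comparatorCalls);
    }
}
```

With the two toy segments in main, both runs count all 10 hits, but the 
baseline makes 8 comparisons and the skipping variant makes 3. The flag is 
reset per segment because document order only guarantees non-competitiveness 
within a single sorted segment.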

There is a patch, along with a luceneutil report on wikimedium10m with the 
index sorted by DayOfYear:
{noformat}
                    Task QPS baseline      StdDev   QPS patch      StdDev    Pct diff
        HighTermMonthSort      226.04      (6.3%)      215.33      (4.3%)   -4.7% ( -14% -    6%)
                  LowTerm      933.27      (5.5%)      924.62      (4.2%)   -0.9% ( -10% -    9%)
             OrNotHighLow      945.68      (5.7%)      939.12      (4.5%)   -0.7% ( -10% -   10%)
              MedSpanNear       28.76      (1.4%)       28.61      (1.5%)   -0.5% (  -3% -    2%)
BrowseDayOfYearSSDVFacets       16.36      (5.0%)       16.29      (4.5%)   -0.4% (  -9% -    9%)
               AndHighMed      112.30      (2.9%)      111.96      (1.6%)   -0.3% (  -4% -    4%)
              LowSpanNear       12.42      (1.5%)       12.38      (1.6%)   -0.3% (  -3% -    2%)
         HighSloppyPhrase       18.66      (3.9%)       18.62      (4.0%)   -0.2% (  -7% -    7%)
                MedPhrase      219.40      (2.7%)      219.06      (2.7%)   -0.2% (  -5% -    5%)
             OrNotHighMed      222.88      (3.2%)      222.63      (3.4%)   -0.1% (  -6% -    6%)
               AndHighLow      521.59      (3.5%)      521.02      (4.5%)   -0.1% (  -7% -    8%)
          MedSloppyPhrase       16.71      (4.7%)       16.70      (4.7%)   -0.0% (  -8% -    9%)
                LowPhrase       15.58      (2.5%)       15.59      (2.9%)    0.0% (  -5% -    5%)
                  Respell       92.05      (2.4%)       92.19      (3.0%)    0.2% (  -5% -    5%)
             HighSpanNear       17.03      (2.2%)       17.06      (2.1%)    0.2% (  -4% -    4%)
               HighPhrase       37.85      (5.8%)       37.92      (5.9%)    0.2% ( -10% -   12%)
             OrHighNotLow      118.25      (2.9%)      118.47      (3.5%)    0.2% (  -6% -    6%)
    BrowseMonthTaxoFacets        2.94      (0.4%)        2.94      (0.8%)    0.2% (   0% -    1%)
     BrowseDateTaxoFacets        2.75      (0.3%)        2.75      (1.6%)    0.3% (  -1% -    2%)
          LowSloppyPhrase      105.28      (2.3%)      105.60      (2.5%)    0.3% (  -4% -    5%)
                  Prefix3      122.07      (6.8%)      122.55      (6.5%)    0.4% ( -12% -   14%)
            OrNotHighHigh       55.07      (3.8%)       55.29      (4.5%)    0.4% (  -7% -    8%)
    BrowseMonthSSDVFacets       20.88      (7.2%)       20.99      (7.5%)    0.5% ( -13% -   16%)
            OrHighNotHigh       58.40      (4.2%)       58.72      (4.8%)    0.6% (  -8% -    9%)
                 Wildcard       79.87      (3.7%)       80.31      (4.0%)    0.6% (  -6% -    8%)
                OrHighMed       13.25      (4.3%)       13.34      (4.9%)    0.6% (  -8% -   10%)
BrowseDayOfYearTaxoFacets        2.73      (0.6%)        2.75      (1.6%)    0.7% (  -1% -    2%)
               OrHighHigh       22.03      (4.1%)       22.19      (4.9%)    0.7% (  -8% -   10%)
              AndHighHigh       23.46      (2.1%)       23.63      (1.9%)    0.7% (  -3% -    4%)
                 PKLookup      145.59      (4.2%)      146.66      (4.3%)    0.7% (  -7% -    9%)
                  MedTerm      171.13      (5.0%)      172.43      (5.1%)    0.8% (  -8% -   11%)
                OrHighLow      119.22      (2.8%)      120.23      (3.1%)    0.8% (  -4% -    6%)
             OrHighNotMed       87.06      (3.7%)       87.80      (4.1%)    0.8% (  -6% -    8%)
                   IntNRQ       26.44     (12.8%)       26.68     (11.5%)    0.9% ( -20% -   28%)
                 HighTerm      107.64      (6.1%)      108.88      (5.6%)    1.2% (  -9% -   13%)
                   Fuzzy2       69.69     (10.7%)       71.64      (7.4%)    2.8% ( -13% -   23%)
                   Fuzzy1       53.95      (6.5%)       55.79      (6.2%)    3.4% (  -8% -   17%)
    HighTermDayOfYearSort       19.71      (4.7%)       21.51      (7.1%)    9.1% (  -2% -   21%)
{noformat}
Unfortunately, luceneutil shows a regression for sorting that does not match 
the index sort (HighTermMonthSort). I can't reproduce the regression in any 
real case, but I'm afraid my benchmarks aren't quite accurate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
