zacharymorn commented on a change in pull request #113:
URL: https://github.com/apache/lucene/pull/113#discussion_r631568912



##########
File path: lucene/core/src/java/org/apache/lucene/search/BMMBulkScorer.java
##########
@@ -0,0 +1,317 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import static org.apache.lucene.search.ScorerUtil.costWithMinShouldMatch;
+
+import java.io.IOException;
+import java.util.*;
+import org.apache.lucene.util.Bits;
+
+/** BulkScorer that leverages BMM algorithm within interval (min, max) */
+public class BMMBulkScorer extends BulkScorer {
+  private List<Scorer> scorers;
+  private DisiWrapper[] allScorers;
+  private Weight weight;
+  private ScoreMode scoreMode;
+  private int scalingFactor;
+  private long cost;
+  private static final int FIXED_WINDOW_SIZE = 2048;

Review comment:
       I also ran wikibigall with the above changes, following the 
suggestions in 
https://github.com/apache/lucene/pull/101#issuecomment-837909869, and got the 
following results over three runs:
   
   wikibigall run 1
   ```
                        Task    QPS baseline      StdDev   QPS my_modified_version      StdDev                Pct diff p-value
                      Fuzzy1       51.08      (9.7%)       44.15     (11.5%)  -13.6% ( -31% -    8%) 0.000
                      Fuzzy2       51.90     (12.0%)       48.39     (10.3%)   -6.8% ( -25% -   17%) 0.056
           TermDayOfYearSort      160.58     (12.1%)      156.99     (11.2%)   -2.2% ( -22% -   23%) 0.542
               TermMonthSort       58.61      (8.4%)       57.57      (8.0%)   -1.8% ( -16% -   15%) 0.494
                  TermDTSort      104.08     (11.4%)      102.37      (9.4%)   -1.6% ( -20% -   21%) 0.619
               TermTitleSort      104.99      (8.3%)      103.29      (7.6%)   -1.6% ( -16% -   15%) 0.519
             AndHighOrMedMed       33.52      (3.1%)       33.02      (2.8%)   -1.5% (  -7% -    4%) 0.114
                 AndHighHigh       18.08      (4.6%)       17.87      (4.1%)   -1.1% (  -9% -    7%) 0.406
                TermBGroup1M       14.14      (4.1%)       14.03      (3.5%)   -0.8% (  -8% -    7%) 0.491
              TermDateFacets        7.58      (5.5%)        7.53      (6.1%)   -0.7% ( -11% -   11%) 0.714
                      Phrase       10.38      (1.8%)       10.32      (2.2%)   -0.6% (  -4% -    3%) 0.359
                  AndHighMed       82.33      (4.0%)       81.87      (3.9%)   -0.6% (  -8% -    7%) 0.655
                SloppyPhrase        2.32      (8.3%)        2.31      (9.9%)   -0.5% ( -17% -   19%) 0.855
                TermGroup100       34.52      (3.7%)       34.36      (3.0%)   -0.5% (  -6% -    6%) 0.650
              TermBGroup1M1P       43.50      (3.8%)       43.30      (4.0%)   -0.5% (  -7% -    7%) 0.700
            AndMedOrHighHigh       25.62      (3.4%)       25.51      (3.1%)   -0.4% (  -6% -    6%) 0.666
                 TermGroup1M       15.43      (3.0%)       15.37      (2.7%)   -0.4% (  -5% -    5%) 0.668
                VectorSearch      823.98      (1.9%)      820.96      (2.6%)   -0.4% (  -4% -    4%) 0.616
                    PKLookup      210.69      (2.6%)      210.23      (2.5%)   -0.2% (  -5% -    4%) 0.782
       BrowseMonthSSDVFacets       18.90      (0.8%)       18.87      (0.9%)   -0.2% (  -1% -    1%) 0.574
   BrowseDayOfYearTaxoFacets        7.14      (5.3%)        7.14      (5.8%)   -0.1% ( -10% -   11%) 0.943
                    Wildcard       38.83      (2.4%)       38.79      (2.5%)   -0.1% (  -4% -    4%) 0.881
                TermGroup10K       18.47      (2.9%)       18.45      (2.5%)   -0.1% (  -5% -    5%) 0.910
                    SpanNear        4.76      (1.4%)        4.76      (1.3%)   -0.1% (  -2% -    2%) 0.885
                     Prefix3      173.23      (6.8%)      173.13      (6.8%)   -0.1% ( -12% -   14%) 0.978
        BrowseDateTaxoFacets        7.46      (5.5%)        7.46      (6.1%)   -0.1% ( -10% -   12%) 0.976
       BrowseMonthTaxoFacets        8.27      (5.7%)        8.27      (6.4%)   -0.0% ( -11% -   12%) 0.986
                     Respell       41.11      (2.8%)       41.12      (2.6%)    0.0% (  -5% -    5%) 0.972
   BrowseDayOfYearSSDVFacets       17.13      (1.8%)       17.14      (1.7%)    0.1% (  -3% -    3%) 0.887
                      IntNRQ      267.98      (2.2%)      268.69      (2.5%)    0.3% (  -4% -    5%) 0.721
            IntervalsOrdered        3.79      (2.1%)        3.81      (2.4%)    0.5% (  -3% -    5%) 0.448
                        Term     1046.89      (7.5%)     1067.03      (7.0%)    1.9% ( -11% -   17%) 0.401
                   OrHighMed       34.43      (3.2%)       37.66      (5.5%)    9.4% (   0% -   18%) 0.000
                  OrHighHigh       16.93      (3.7%)       25.19      (4.6%)   48.8% (  39% -   59%) 0.000
   ```
   
   wikibigall run 2
   ```
                        Task    QPS baseline      StdDev   QPS my_modified_version      StdDev                Pct diff p-value
                      Fuzzy1       50.87      (9.8%)       47.08     (13.8%)   -7.5% ( -28% -   17%) 0.049
                      Fuzzy2       30.31      (5.6%)       28.38      (9.6%)   -6.4% ( -20% -    9%) 0.011
               TermMonthSort       59.51     (12.2%)       58.67      (9.4%)   -1.4% ( -20% -   23%) 0.683
                  TermDTSort      172.78     (11.5%)      170.44     (10.1%)   -1.4% ( -20% -   22%) 0.692
            AndMedOrHighHigh        9.65      (3.1%)        9.55      (2.8%)   -1.1% (  -6% -    4%) 0.233
               TermTitleSort       59.25     (12.2%)       58.60      (9.8%)   -1.1% ( -20% -   23%) 0.754
              TermDateFacets        8.18      (7.5%)        8.13      (7.7%)   -0.6% ( -14% -   15%) 0.789
                     Respell       46.60      (3.8%)       46.33      (3.6%)   -0.6% (  -7% -    7%) 0.628
            IntervalsOrdered        3.81      (2.7%)        3.80      (2.8%)   -0.4% (  -5% -    5%) 0.674
             AndHighOrMedMed       24.00      (3.0%)       23.94      (3.4%)   -0.3% (  -6% -    6%) 0.792
                  AndHighMed       59.47      (3.1%)       59.34      (3.6%)   -0.2% (  -6% -    6%) 0.837
                 TermGroup1M       22.27      (3.4%)       22.23      (3.5%)   -0.1% (  -6% -    7%) 0.895
        BrowseDateTaxoFacets        7.28      (7.7%)        7.27      (7.8%)   -0.1% ( -14% -   16%) 0.956
   BrowseDayOfYearTaxoFacets        6.97      (7.5%)        6.96      (7.5%)   -0.1% ( -14% -   16%) 0.958
       BrowseMonthTaxoFacets        8.08      (7.7%)        8.07      (7.9%)   -0.1% ( -14% -   16%) 0.962
                 AndHighHigh       64.73      (2.8%)       64.67      (3.7%)   -0.1% (  -6% -    6%) 0.921
                    Wildcard       70.06      (3.1%)       70.00      (3.2%)   -0.1% (  -6% -    6%) 0.924
       BrowseMonthSSDVFacets       18.76      (0.9%)       18.77      (0.9%)    0.0% (  -1% -    1%) 0.919
                      Phrase       20.88      (3.8%)       20.90      (3.2%)    0.1% (  -6% -    7%) 0.936
                TermGroup10K       12.15      (3.7%)       12.16      (4.0%)    0.1% (  -7% -    8%) 0.931
              TermBGroup1M1P       15.29      (5.1%)       15.31      (4.6%)    0.1% (  -9% -   10%) 0.936
                     Prefix3       32.94      (2.9%)       32.99      (2.9%)    0.1% (  -5% -    6%) 0.872
   BrowseDayOfYearSSDVFacets       17.10      (1.7%)       17.13      (1.7%)    0.2% (  -3% -    3%) 0.768
                TermGroup100       34.25      (3.8%)       34.34      (3.9%)    0.3% (  -7% -    8%) 0.829
                SloppyPhrase        2.82      (7.5%)        2.83      (7.4%)    0.3% ( -13% -   16%) 0.900
           TermDayOfYearSort       45.78     (11.8%)       45.93     (10.6%)    0.3% ( -19% -   25%) 0.926
                    SpanNear       10.00      (1.2%)       10.05      (1.2%)    0.4% (  -1% -    2%) 0.253
                      IntNRQ      108.69     (24.1%)      109.25     (23.7%)    0.5% ( -38% -   63%) 0.945
                TermBGroup1M       11.95      (4.5%)       12.03      (5.2%)    0.7% (  -8% -   10%) 0.661
                    PKLookup      201.05      (6.0%)      203.48      (4.0%)    1.2% (  -8% -   11%) 0.451
                        Term      667.45      (5.8%)      683.87      (7.3%)    2.5% ( -10% -   16%) 0.240
                VectorSearch      989.57      (5.4%)     1021.23      (5.0%)    3.2% (  -6% -   14%) 0.051
                   OrHighMed       58.35      (3.9%)       69.23      (5.8%)   18.6% (   8% -   29%) 0.000
                  OrHighHigh       11.04      (3.4%)       16.84      (6.2%)   52.5% (  41% -   64%) 0.000
   ```
   
   wikibigall run 3
   ```
                        Task    QPS baseline      StdDev   QPS my_modified_version      StdDev                Pct diff p-value
                      Fuzzy1       56.20     (11.1%)       49.60     (12.0%)  -11.7% ( -31% -   12%) 0.001
               TermMonthSort       61.43     (11.3%)       57.85     (14.1%)   -5.8% ( -28% -   21%) 0.148
               TermTitleSort      109.97     (11.2%)      103.85     (14.1%)   -5.6% ( -27% -   22%) 0.167
                  TermDTSort      160.77     (10.8%)      151.92     (13.5%)   -5.5% ( -26% -   21%) 0.156
           TermDayOfYearSort       55.50      (7.1%)       52.92     (15.5%)   -4.6% ( -25% -   19%) 0.222
                TermGroup10K       10.30      (4.6%)       10.02      (7.4%)   -2.7% ( -14% -    9%) 0.160
                        Term     1037.48      (5.2%)     1010.63      (7.6%)   -2.6% ( -14% -   10%) 0.210
                TermBGroup1M       21.54      (5.0%)       21.00      (7.4%)   -2.5% ( -14% -   10%) 0.212
                TermGroup100       18.89      (4.4%)       18.46      (7.8%)   -2.3% ( -13% -   10%) 0.255
              TermDateFacets       10.29      (9.2%)       10.11      (9.5%)   -1.8% ( -18% -   18%) 0.536
              TermBGroup1M1P       43.52      (4.9%)       42.88      (5.6%)   -1.5% ( -11% -    9%) 0.373
                      Fuzzy2       56.25     (13.4%)       55.53     (12.5%)   -1.3% ( -24% -   28%) 0.754
                 TermGroup1M       22.31      (3.8%)       22.04      (5.2%)   -1.2% (  -9% -    8%) 0.389
            AndMedOrHighHigh       28.60      (2.5%)       28.31      (2.7%)   -1.0% (  -6% -    4%) 0.222
                      Phrase       59.81      (2.9%)       59.43      (3.1%)   -0.6% (  -6% -    5%) 0.498
                    PKLookup      205.40      (3.8%)      204.10      (4.9%)   -0.6% (  -8% -    8%) 0.648
                VectorSearch     1033.68      (4.0%)     1027.88      (4.3%)   -0.6% (  -8% -    8%) 0.670
        BrowseDateTaxoFacets        7.27      (6.9%)        7.24      (7.0%)   -0.4% ( -13% -   14%) 0.859
   BrowseDayOfYearTaxoFacets        6.97      (6.6%)        6.94      (6.8%)   -0.4% ( -12% -   13%) 0.854
                SloppyPhrase       18.29      (2.0%)       18.22      (2.8%)   -0.4% (  -5% -    4%) 0.612
       BrowseMonthTaxoFacets        8.05      (6.9%)        8.02      (7.0%)   -0.3% ( -13% -   14%) 0.891
             AndHighOrMedMed       23.88      (2.7%)       23.83      (2.3%)   -0.2% (  -5% -    4%) 0.774
            IntervalsOrdered        3.83      (2.5%)        3.83      (2.6%)   -0.1% (  -5% -    5%) 0.862
                      IntNRQ      123.08     (14.8%)      122.93     (15.0%)   -0.1% ( -26% -   34%) 0.979
                    Wildcard       58.03      (2.7%)       57.97      (3.1%)   -0.1% (  -5% -    5%) 0.901
   BrowseDayOfYearSSDVFacets       16.93      (1.7%)       16.91      (1.5%)   -0.1% (  -3% -    3%) 0.851
                     Prefix3      165.67     (10.5%)      165.54      (9.6%)   -0.1% ( -18% -   22%) 0.980
                    SpanNear        4.76      (1.3%)        4.77      (1.0%)    0.0% (  -2% -    2%) 0.915
       BrowseMonthSSDVFacets       18.78      (1.4%)       18.80      (1.3%)    0.1% (  -2% -    2%) 0.815
                     Respell       47.08      (4.1%)       47.19      (4.1%)    0.2% (  -7% -    8%) 0.851
                 AndHighHigh       17.36      (3.4%)       17.50      (3.1%)    0.8% (  -5% -    7%) 0.435
                  AndHighMed       32.21      (3.6%)       32.50      (3.2%)    0.9% (  -5% -    7%) 0.406
                   OrHighMed       33.59      (3.2%)       37.09      (3.8%)   10.4% (   3% -   18%) 0.000
                  OrHighHigh       10.82      (3.7%)       17.08      (4.1%)   57.8% (  48% -   68%) 0.000
   ```
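   
   For reading the tables: the `Pct diff` column appears to be the relative 
   change in mean QPS against the baseline. A minimal sketch of that 
   arithmetic, assuming it is derived from the two QPS columns shown (the 
   `PctDiff` class and `pctDiff` method are hypothetical names for 
   illustration, not part of luceneutil, and the rounded values in the 
   tables may not reproduce every row exactly):
   
   ```java
   class PctDiff {
     /** Relative change of the modified QPS against the baseline QPS, in percent. */
     static double pctDiff(double baselineQps, double modifiedQps) {
       return 100.0 * (modifiedQps - baselineQps) / baselineQps;
     }

     public static void main(String[] args) {
       // Mean QPS values copied from the OrHighHigh and Fuzzy1 rows of run 1.
       System.out.printf("OrHighHigh: %.1f%%%n", pctDiff(16.93, 25.19));
       System.out.printf("Fuzzy1:     %.1f%%%n", pctDiff(51.08, 44.15));
     }
   }
   ```
   
   With the run 1 inputs this gives roughly +48.8% for OrHighHigh and -13.6% 
   for Fuzzy1, in line with the reported values.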



