shimpeko commented on PR #15659:
URL: https://github.com/apache/lucene/pull/15659#issuecomment-3853027797

   I added test cases that are close to my query, which combines dismax + 
constant_score: https://github.com/shimpeko/luceneutil/pull/1/changes. 
Looking at the benchmark results below, I think I can say that the changes 
in this PR have a significant positive impact on the performance of a specific 
type of query.
   
   > As far as block-max optimizations are concerned, DisjunctionBulkMaxScorer 
tracks the min competitive score and passes it to its sub clauses whenever 
scoring a window
   
   This seems true, so I don't yet clearly understand why using 
DisjunctionBulkMaxScorer causes a regression in this particular case. 
@jpountz do you have any idea?
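
   For readers unfamiliar with the term, the general idea behind "min 
competitive score" pruning can be shown with a toy sketch. This is not 
Lucene's DisjunctionBulkMaxScorer. The function name, the window layout and 
the scores are all made up for illustration, and each window is assumed to 
come with a precomputed upper bound on its scores.

```python
import heapq

def top_k_with_block_max(windows, k):
    """Toy block-max top-k (hypothetical helper, not a Lucene API).

    windows: list of (max_score, [(doc_id, score), ...]) where max_score
    is an upper bound on every score inside that window.
    Returns (top hits sorted by descending score, number of skipped windows).
    """
    heap = []      # min-heap of (score, doc_id) holding the current top k
    skipped = 0
    for max_score, docs in windows:
        # The min competitive score is the weakest hit once the heap is full.
        min_competitive = heap[0][0] if len(heap) == k else float("-inf")
        if max_score <= min_competitive:
            skipped += 1   # nothing in this window can enter the top k
            continue
        for doc_id, score in docs:
            if len(heap) < k:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True), skipped

# Made-up data: the second window is skipped because its bound (0.5)
# cannot beat the current weakest top-2 hit (1.0).
example_windows = [
    (3.0, [(0, 1.0), (1, 3.0)]),
    (0.5, [(2, 0.5), (3, 0.2)]),
    (5.0, [(4, 5.0), (5, 0.1)]),
]
top_hits, skipped_windows = top_k_with_block_max(example_windows, k=2)
```

   The sketch only illustrates why passing the threshold down to sub clauses 
can save work. It says nothing about where the regression above comes from.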
   
   
   Results (50 warmups, 50 iterations)
   ```
                       Task  QPS baseline   StdDev  QPS my_modified_version   StdDev               Pct diff  p-value
                 DismaxTerm       1034.68  (11.6%)                   935.85  (11.7%)  -9.6% ( -29% -   15%)  0.000
            DismaxOrHighMed        224.64  (16.2%)                   206.15  (16.7%)  -8.2% ( -35% -   29%)  0.012
         FilteredDismaxTerm        201.44   (9.5%)                   189.45  (12.1%)  -5.9% ( -25% -   17%)  0.006
   FilteredDismaxOrHighHigh         61.88  (14.1%)                    59.53  (11.3%)  -3.8% ( -25% -   25%)  0.137
           DismaxOrHighHigh         99.12  (13.0%)                    95.37  (12.9%)  -3.8% ( -26% -   25%)  0.143
                   PKLookup        240.78  (14.9%)                   232.55  (15.0%)  -3.4% ( -29% -   31%)  0.254
    FilteredDismaxOrHighMed        170.69  (14.1%)                   166.51  (13.5%)  -2.5% ( -26% -   29%)  0.374
              DisMaxCsTerm1       1814.59  (12.8%)                  2010.12  (16.9%)  10.8% ( -16% -   46%)  0.000
             DisMaxCSTerm20        169.78  (12.3%)                   307.85  (43.2%)  81.3% (  22% -  156%)  0.000
   
   ```
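
   As a reading aid for the table, the "Pct diff" column appears to be the 
relative change in mean QPS from baseline to candidate. The sketch below 
recomputes it for three rows under that assumption. `pct_diff` is a helper 
written for this example and is not part of luceneutil.

```python
def pct_diff(baseline_qps: float, candidate_qps: float) -> float:
    """Relative QPS change in percent (assumed meaning of 'Pct diff')."""
    return (candidate_qps - baseline_qps) / baseline_qps * 100.0

# (baseline QPS, candidate QPS) copied from the results table above.
rows = {
    "DismaxTerm": (1034.68, 935.85),
    "DisMaxCsTerm1": (1814.59, 2010.12),
    "DisMaxCSTerm20": (169.78, 307.85),
}

for task, (baseline, candidate) in rows.items():
    print(f"{task:>16} {pct_diff(baseline, candidate):+6.1f}%")
```

   Rounded to one decimal place, these values match the reported -9.6%, 
10.8% and 81.3%.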
   
   <details>
   <summary>Commands to run the benchmark and task details</summary>
   
   ```
   util % cd ../lucene_candidate && git show -s --oneline HEAD && cd ../util
   68ada56464 (HEAD -> dismax-bulk-heuristic, origin/dismax-bulk-heuristic) ./gradlew tidy --rerun-tasks
   util % cd ../lucene_baseline && git show -s --oneline HEAD && cd ../util
   7ebdb9316e5 (HEAD -> main, origin/main, origin/HEAD) Add next minor version 10.5.0
   util % cd ../lucene_candidate && git show -s --oneline HEAD && cd ../util
   68ada56464 (HEAD -> dismax-bulk-heuristic, origin/dismax-bulk-heuristic) ./gradlew tidy --rerun-tasks
   util % grep -A 5 'sourceData =' src/python/localrun.py                   
     sourceData = competition.Data(
       "wikimediumall",
       constants.WIKI_MEDIUM_DOCS_LINE_FILE,
       constants.WIKI_MEDIUM_DOCS_COUNT,
       constants.DISMAX_TASKS_FILE,
     )
   util % cat src/python/localconstants.py 
   import os
   
   BASE_DIR = '/Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home'
   BENCH_BASE_DIR = '/Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home/util'
   DISMAX_TASKS_FILE = '%s/tasks/dismax_constantscore.tasks' % BENCH_BASE_DIR
   
   #JAVA_HOME = os.environ.get("JAVA_HOME")
   #java_bin = JAVA_HOME + "/bin/" if JAVA_HOME else ""
   #if java_bin:
   #  print("Using java from: %s" % java_bin)
   #if "JAVA_EXE" not in globals():
   #  JAVA_EXE = f"{java_bin}java"
   #if "JAVAC_EXE" not in globals():
   #  JAVAC_EXE = f"{java_bin}javac"
   #if "JAVA_COMMAND" not in globals():
   #  JAVA_COMMAND = "%s -server -Xms2g -Xmx2g --add-modules jdk.incubator.vector -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParallelGC  -Dlucene.dismax.debug=true" % JAVA_EXE
   #else:
   #  print("use java command %s" % JAVA_COMMAND)  # pyright: ignore[reportUndefinedVariable] # TODO: fix how variables are managed here
   util % python src/python/localrun.py --iteration=50 --warmups=50 -b /Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home/lucene_baseline -c /Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home/lucene_candidate > result.txt
   ```
   </details>
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

