txwei commented on PR #16240: URL: https://github.com/apache/lucene/pull/16240#issuecomment-4678160871
I haven't had the chance to go through the entire PR, but I got some benchmark results from my luceneutil [branch](https://github.com/mikemccand/luceneutil/compare/main...txwei:luceneutil:leading-wildcard-query?expand=1) that tests the query `WildcardLeadingAndMissing: +body:/.*qmzxwvbb.*/ +body:zzznomatchqqq`

```
Task                         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                 p-value
WildcardLeadingAndMissing         2231.02   (4.9%)                   222.61   (0.6%)    -90.0% ( -91% - -88%)  0.000
Wildcard                            92.50   (4.7%)                    84.02   (3.3%)     -9.2% ( -16% - -1%)   0.000
BrowseMonthTaxoFacets               40.43  (18.0%)                    37.43  (25.3%)     -7.4% ( -43% - 43%)   0.285
BrowseRandomLabelTaxoFacets         40.36  (42.9%)                    37.79  (43.1%)     -6.4% ( -64% - 139%)  0.639
BrowseDateSSDVFacets                 4.64  (18.3%)                     4.34  (16.7%)     -6.3% ( -34% - 34%)   0.251
BrowseDateTaxoFacets                36.70  (26.8%)                    34.90  (27.9%)     -4.9% ( -47% - 67%)   0.570
AndMissingHigh                    5089.20   (7.8%)                  4843.00   (8.2%)     -4.8% ( -19% - 12%)   0.056
BrowseDayOfYearTaxoFacets           37.11  (26.7%)                    35.35  (28.0%)     -4.7% ( -46% - 68%)   0.583
BrowseRandomLabelSSDVFacets         20.53   (4.8%)                    20.05   (3.3%)     -2.4% ( -9% - 6%)     0.070
BrowseMonthSSDVFacets               27.47   (6.0%)                    26.98   (4.3%)     -1.8% ( -11% - 9%)    0.281
LowPhrase                          462.65   (5.9%)                   454.92   (7.1%)     -1.7% ( -13% - 12%)   0.421
MedPhrase                          240.31   (4.5%)                   236.66   (3.7%)     -1.5% ( -9% - 6%)     0.243
HighTermTitleBDVSort               107.48   (3.1%)                   105.89   (3.6%)     -1.5% ( -7% - 5%)     0.164
MedIntervalsOrdered                778.45   (6.4%)                   767.22   (7.1%)     -1.4% ( -13% - 12%)   0.497
Prefix3                           1422.93   (5.2%)                  1405.62   (6.0%)     -1.2% ( -11% - 10%)   0.494
OrHighHigh                         584.20   (8.6%)                   579.03   (7.2%)     -0.9% ( -15% - 16%)   0.723
OrHighMedDayTaxoFacets              36.62   (3.6%)                    36.32   (3.7%)     -0.8% ( -7% - 6%)     0.483
LowIntervalsOrdered                433.68   (4.1%)                   431.29   (3.9%)     -0.6% ( -8% - 7%)     0.663
HighIntervalsOrdered               160.33   (5.2%)                   159.66   (6.4%)     -0.4% ( -11% - 11%)   0.821
AndHighHigh                        543.48   (5.8%)                   541.30   (5.6%)     -0.4% ( -11% - 11%)   0.823
HighSloppyPhrase                   103.10   (3.8%)                   102.72   (3.0%)     -0.4% ( -6% - 6%)     0.736
OrHighNotMed                       929.40   (8.7%)                   926.12   (7.4%)     -0.4% ( -15% - 17%)   0.890
AndHighMedDayTaxoFacets            322.35   (2.7%)                   321.21   (2.8%)     -0.4% ( -5% - 5%)     0.685
OrHighNotHigh                      553.78   (5.5%)                   552.27   (6.7%)     -0.3% ( -11% - 12%)   0.888
BrowseDayOfYearSSDVFacets           26.73   (4.1%)                    26.70   (4.0%)     -0.1% ( -7% - 8%)     0.927
range                             8114.33   (6.0%)                  8109.09   (7.3%)     -0.1% ( -12% - 14%)   0.976
Fuzzy1                             173.02   (3.1%)                   173.14   (3.4%)      0.1% ( -6% - 6%)     0.947
MedTermDayTaxoFacets               163.97   (1.9%)                   164.15   (2.2%)      0.1% ( -3% - 4%)     0.864
TermDTSort                         500.51   (4.8%)                   501.10   (7.3%)      0.1% ( -11% - 12%)   0.952
OrNotHighHigh                      530.40   (5.6%)                   531.17   (7.1%)      0.1% ( -11% - 13%)   0.943
LowSpanNear                        491.59   (3.1%)                   493.37   (3.0%)      0.4% ( -5% - 6%)     0.704
MedTerm                           2414.76   (9.1%)                  2423.75  (10.2%)      0.4% ( -17% - 21%)   0.903
MedSloppyPhrase                    430.49   (4.3%)                   432.78   (3.3%)      0.5% ( -6% - 8%)     0.662
OrHighMed                         1209.13   (4.9%)                  1215.97   (7.3%)      0.6% ( -11% - 13%)   0.773
HighTermTitleSort                  223.33   (5.4%)                   224.71   (6.4%)      0.6% ( -10% - 13%)   0.742
Fuzzy2                             128.13   (2.3%)                   128.95   (1.6%)      0.6% ( -3% - 4%)     0.307
Respell                            100.27   (2.8%)                   100.94   (1.9%)      0.7% ( -3% - 5%)     0.380
PKLookup                           538.59   (4.5%)                   542.19   (3.9%)      0.7% ( -7% - 9%)     0.616
HighTermMonthSort                 2007.20   (4.4%)                  2023.86   (4.9%)      0.8% ( -8% - 10%)    0.572
LowSloppyPhrase                    200.75   (3.4%)                   202.43   (3.3%)      0.8% ( -5% - 7%)     0.424
OrHighLow                         1754.67   (5.8%)                  1769.78   (5.0%)      0.9% ( -9% - 12%)    0.615
OrNotHighMed                       651.53   (6.7%)                   657.15   (5.9%)      0.9% ( -10% - 14%)   0.665
AndHighHighDayTaxoFacets            99.83   (2.0%)                   100.70   (2.6%)      0.9% ( -3% - 5%)     0.229
IntNRQ                             365.55   (3.4%)                   369.87   (4.2%)      1.2% ( -6% - 9%)     0.324
AndHighMed                        1535.25   (4.8%)                  1554.12   (4.7%)      1.2% ( -7% - 11%)    0.410
IntSet                            1530.59  (11.4%)                  1549.68  (15.0%)      1.2% ( -22% - 31%)   0.767
MedSpanNear                        589.53   (2.9%)                   597.05   (2.6%)      1.3% ( -4% - 6%)     0.140
AndHighLow                        2813.83   (5.3%)                  2856.49   (4.3%)      1.5% ( -7% - 11%)    0.318
HighPhrase                         378.67   (6.2%)                   385.19   (5.5%)      1.7% ( -9% - 14%)    0.353
OrNotHighLow                      2204.56   (5.7%)                  2244.83   (5.4%)      1.8% ( -8% - 13%)    0.300
HighSpanNear                        99.72   (4.7%)                   101.72   (5.8%)      2.0% ( -8% - 13%)    0.231
OrHighNotLow                      1894.41   (5.9%)                  1935.91   (6.7%)      2.2% ( -9% - 15%)    0.274
HighTerm                          2163.41  (10.6%)                  2213.07  (13.4%)      2.3% ( -19% - 29%)   0.547
HighTermDayOfYearSort              534.52   (5.3%)                   547.31   (5.8%)      2.4% ( -8% - 14%)    0.171
LowTerm                           3253.52   (9.0%)                  3354.53   (8.0%)      3.1% ( -12% - 22%)   0.247
```

Notably, this would regress the leading wildcard query by 90% and the `Wildcard` task by 9.2%.
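
For anyone checking the two headline numbers, they can be reproduced from the mean QPS columns alone. This is a minimal sketch, assuming `Pct diff` is the relative change of the modified-version mean against the baseline mean; `pct_diff` is a hypothetical helper written for illustration, not a luceneutil function, and it does not attempt the StdDev-based range or the p-value.

```python
def pct_diff(baseline_qps: float, candidate_qps: float) -> float:
    """Relative change of candidate vs. baseline, in percent (negative = slower)."""
    return (candidate_qps - baseline_qps) / baseline_qps * 100.0


# Mean QPS values copied from the two rows with p-value 0.000 in the table.
rows = {
    "WildcardLeadingAndMissing": (2231.02, 222.61),
    "Wildcard": (92.50, 84.02),
}

for task, (baseline, candidate) in rows.items():
    print(f"{task}: {pct_diff(baseline, candidate):+.1f}%")
```

Under that assumption the two rows come out at about -90.0% and -9.2%, matching the `Pct diff` column.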
