[ https://issues.apache.org/jira/browse/LUCENE-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404486#comment-17404486 ]
Adrien Grand commented on LUCENE-9613:
--------------------------------------

By removing the wrapping of NumericDocValues by SortedDocValues (see attached PR), I get even better numbers (the baseline has the above change, so this speedup is on top of the previous one).

{noformat}
Task                       QPS baseline   StdDev  QPS patch   StdDev  Pct diff            p-value
LowSloppyPhrase                  138.15   (4.5%)     135.56   (4.0%)  -1.9% ( -9% -  6%)  0.163
HighSloppyPhrase                  38.34   (5.5%)      37.80   (5.7%)  -1.4% (-11% - 10%)  0.432
MedSloppyPhrase                   68.13   (4.2%)      67.27   (3.6%)  -1.3% ( -8% -  6%)  0.305
MedSpanNear                       97.29   (2.4%)      96.13   (2.8%)  -1.2% ( -6% -  4%)  0.150
Respell                          198.32   (5.0%)     195.98   (5.0%)  -1.2% (-10% -  9%)  0.456
HighSpanNear                      26.55   (3.7%)      26.24   (6.1%)  -1.2% (-10% -  8%)  0.462
LowSpanNear                       15.53   (2.9%)      15.37   (3.6%)  -1.0% ( -7% -  5%)  0.330
HighIntervalsOrdered              20.91   (6.2%)      20.73   (5.5%)  -0.9% (-11% - 11%)  0.633
Wildcard                         248.72  (13.6%)     246.63  (14.4%)  -0.8% (-25% - 31%)  0.849
LowIntervalsOrdered              241.69   (6.7%)     239.97   (6.4%)  -0.7% (-12% - 13%)  0.731
LowPhrase                         54.03   (3.3%)      53.67   (2.9%)  -0.7% ( -6% -  5%)  0.495
OrNotHighLow                     607.98   (3.2%)     604.28   (3.1%)  -0.6% ( -6% -  5%)  0.542
MedIntervalsOrdered               33.22   (3.4%)      33.03   (2.8%)  -0.6% ( -6% -  5%)  0.562
MedPhrase                        292.37   (3.8%)     290.84   (3.5%)  -0.5% ( -7% -  6%)  0.648
HighTermTitleBDVSort              21.76   (2.2%)      21.65   (3.1%)  -0.5% ( -5% -  4%)  0.564
HighTerm                        2062.70   (4.0%)    2053.13   (4.2%)  -0.5% ( -8% -  8%)  0.722
OrHighLow                        619.26   (2.9%)     616.86   (2.9%)  -0.4% ( -5% -  5%)  0.669
AndHighLow                       922.79   (4.8%)     919.30   (4.1%)  -0.4% ( -8% -  8%)  0.788
Prefix3                          409.80   (6.5%)     408.28   (6.6%)  -0.4% (-12% - 13%)  0.857
OrHighNotMed                    1354.26   (3.8%)    1349.40   (4.2%)  -0.4% ( -8% -  7%)  0.777
AndHighHigh                       55.31   (4.0%)      55.14   (5.0%)  -0.3% ( -8% -  9%)  0.838
IntNRQ                           190.47   (1.0%)     189.99   (0.6%)  -0.3% ( -1% -  1%)  0.351
AndHighMed                       310.69   (5.0%)     310.04   (5.3%)  -0.2% ( -9% - 10%)  0.898
HighPhrase                       210.32   (2.2%)     209.90   (1.9%)  -0.2% ( -4% -  4%)  0.763
TermDTSort                       108.34   (3.2%)     108.15   (3.1%)  -0.2% ( -6% -  6%)  0.856
OrNotHighMed                    1059.23   (2.8%)    1057.74   (3.4%)  -0.1% ( -6% -  6%)  0.887
OrHighNotLow                     919.86   (3.1%)     919.37   (3.2%)  -0.1% ( -6% -  6%)  0.957
MedTerm                         2131.16   (3.7%)    2140.13   (4.5%)   0.4% ( -7% -  8%)  0.747
OrNotHighHigh                   1217.26   (3.1%)    1222.56   (3.9%)   0.4% ( -6% -  7%)  0.698
HighTermDayOfYearSort             91.07   (7.1%)      91.73   (7.0%)   0.7% (-12% - 15%)  0.745
OrHighNotHigh                    924.82   (3.3%)     931.81   (3.6%)   0.8% ( -5% -  7%)  0.486
Fuzzy1                            66.97   (5.9%)      67.57   (7.0%)   0.9% (-11% - 14%)  0.657
OrHighHigh                        26.63   (3.1%)      26.88   (3.5%)   0.9% ( -5% -  7%)  0.373
OrHighMed                        100.56   (3.3%)     101.63   (3.4%)   1.1% ( -5% -  8%)  0.315
LowTerm                         3005.79   (6.2%)    3044.90   (5.7%)   1.3% ( -9% - 14%)  0.490
Fuzzy2                           151.86  (10.1%)     154.03   (9.2%)   1.4% (-16% - 23%)  0.642
BrowseDateTaxoFacets               3.12   (5.6%)       3.17   (3.8%)   1.8% ( -7% - 11%)  0.235
BrowseMonthTaxoFacets              3.44   (5.3%)       3.50   (4.3%)   1.9% ( -7% - 12%)  0.211
BrowseDayOfYearTaxoFacets          3.12   (5.6%)       3.18   (4.0%)   2.0% ( -7% - 12%)  0.202
HighTermMonthSort                 70.56   (9.8%)      74.99  (10.5%)   6.3% (-12% - 29%)  0.051
BrowseMonthSSDVFacets             14.44   (4.9%)      18.97  (34.9%)  31.4% ( -8% - 74%)  0.000
BrowseDayOfYearSSDVFacets         14.88   (7.2%)      19.77  (32.7%)  32.9% ( -6% - 78%)  0.000
{noformat}

The change might be a bit more controversial given that it requires checking some of the numeric optimizations, which is why I didn't push it right away.

> Create blocks for ords when it helps in Lucene80DocValuesFormat
> ---------------------------------------------------------------
>
>                 Key: LUCENE-9613
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9613
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>             Fix For: main (9.0)
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently for sorted(-set) values, we always write ords using
> log2(valueCount) bits per entry. However in several cases like when the field
> is used in the index sort, or if one value is _very_ common, splitting into
> blocks like we do for numerics would help.
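
The trade-off in the issue description can be illustrated with a small cost model. The following is only a sketch, not the code from the attached PR and not Lucene's actual on-disk format: the class `OrdBlockSketch`, its method names, the block size of 128, and the 40-bit per-block header are all hypothetical values chosen for illustration. It compares writing every ord with a single width derived from valueCount against writing blocks that each store a minimum and a per-block width for the deltas.

```java
// Illustrative cost model only; names and constants are assumptions, not Lucene APIs.
final class OrdBlockSketch {

    // Bits needed to represent any value in [0, maxValue], with a floor of 1 bit.
    static int bitsRequired(long maxValue) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    // Total bits when every ord uses one fixed width derived from valueCount.
    static long flatCost(int[] ords, int valueCount) {
        return (long) ords.length * bitsRequired(valueCount - 1L);
    }

    // Total bits when each block stores a header (hypothetical size) plus
    // deltas from the block minimum; a block with a single distinct ord
    // is assumed to need only its header.
    static long blockedCost(int[] ords, int blockSize, int headerBits) {
        long total = 0;
        for (int start = 0; start < ords.length; start += blockSize) {
            int end = Math.min(ords.length, start + blockSize);
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (int i = start; i < end; i++) {
                min = Math.min(min, ords[i]);
                max = Math.max(max, ords[i]);
            }
            total += headerBits;
            if (max != min) {
                total += (long) (end - start) * bitsRequired((long) max - min);
            }
        }
        return total;
    }

    // Ords as they would appear under an index sort on this field:
    // non-decreasing, with docsPerValue consecutive documents per ord.
    static int[] indexSortedOrds(int docCount, int docsPerValue) {
        int[] ords = new int[docCount];
        for (int i = 0; i < docCount; i++) {
            ords[i] = i / docsPerValue;
        }
        return ords;
    }

    public static void main(String[] args) {
        int[] ords = indexSortedOrds(4096, 4); // 1024 distinct values
        System.out.println("flat bits:    " + flatCost(ords, 1024));
        System.out.println("blocked bits: " + blockedCost(ords, 128, 40));
    }
}
```

In this model, blocks pay off when the ords inside a block span a narrow range (index sort) or collapse to one value (a very common value), and they cost a little extra header space when ords are spread uniformly across the whole value range. That matches the "when it helps" wording of the issue title.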