gf2121 edited a comment on pull request #557: URL: https://github.com/apache/lucene/pull/557#issuecomment-999516790
Hi @rmuir @jpountz, thanks a lot for all the discussion on this! I think we have **probably** found a better approach:

> Actually, what I thought of at first was to change only the structure of the addresses, implementing a new LongValues to replace the DirectReader or DirectMonotonicReader for reading addresses, e.g. a ForUtilLongValues. When a caller asks for the long at an index, it will use ForUtil to decompress the required block (caching the block, of course, so that if the next call falls in the same block we can reuse it).

I implemented this idea and saw good benchmark results on the **newest** luceneutil (wiki10m), with no obviously slower tasks. I raised a new issue about this: https://issues.apache.org/jira/browse/LUCENE-10334

**Edit**: In addition, the new optimization does not target exactly the same place as this PR. This PR tries to optimize BinaryDocValues, while the new optimization uses the new reader on `NumericDocValues#LongValue`. That is because the effect on `NumericDocValues#LongValue` is easier to see in the newest luceneutil, and I think that if the new reader is justified for `NumericDocValues`, we can easily port it to the DirectMonotonicReader :)

```
Task                        QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                 p-value
OrNotHighHigh                  694.17   (8.2%)    685.83   (7.0%)    -1.2% ( -15% -  15%)  0.618
Respell                         75.15   (2.7%)     74.32   (2.0%)    -1.1% (  -5% -   3%)  0.146
Prefix3                        220.11   (5.1%)    217.78   (5.8%)    -1.1% ( -11% -  10%)  0.541
Wildcard                       129.75   (3.7%)    128.63   (2.5%)    -0.9% (  -6% -   5%)  0.383
LowSpanNear                     68.54   (2.1%)     68.00   (2.4%)    -0.8% (  -5% -   3%)  0.269
OrNotHighMed                   732.90   (6.8%)    727.49   (5.3%)    -0.7% ( -12% -  12%)  0.703
BrowseRandomLabelTaxoFacets  11879.03   (8.6%)  11799.33   (5.5%)    -0.7% ( -13% -  14%)  0.769
HighSloppyPhrase                 6.87   (2.9%)      6.83   (2.3%)    -0.6% (  -5% -   4%)  0.496
OrHighNotMed                   827.54   (9.2%)    822.94   (8.0%)    -0.6% ( -16% -  18%)  0.838
MedSpanNear                     18.92   (5.7%)     18.82   (5.6%)    -0.5% ( -11% -  11%)  0.759
OrHighMedDayTaxoFacets          10.27   (4.0%)     10.21   (4.3%)    -0.5% (  -8% -   8%)  0.676
PKLookup                       207.98   (4.0%)    206.85   (2.7%)    -0.5% (  -7% -   6%)  0.621
LowIntervalsOrdered            159.17   (2.3%)    158.32   (2.2%)    -0.5% (  -4% -   3%)  0.445
HighSpanNear                     6.32   (4.2%)      6.28   (4.1%)    -0.5% (  -8% -   8%)  0.691
MedIntervalsOrdered             85.31   (3.2%)     84.88   (2.9%)    -0.5% (  -6% -   5%)  0.607
HighTerm                      1170.55   (5.8%)   1164.79   (3.9%)    -0.5% (  -9% -   9%)  0.753
LowSloppyPhrase                 14.54   (3.1%)     14.48   (2.9%)    -0.4% (  -6% -   5%)  0.651
HighPhrase                     112.81   (4.4%)    112.39   (4.1%)    -0.4% (  -8% -   8%)  0.781
OrNotHighLow                   858.02   (5.9%)    854.99   (4.8%)    -0.4% ( -10% -  10%)  0.835
HighIntervalsOrdered            25.08   (2.8%)     25.00   (2.6%)    -0.3% (  -5% -   5%)  0.701
MedPhrase                       27.20   (2.1%)     27.11   (2.9%)    -0.3% (  -5% -   4%)  0.689
MedTermDayTaxoFacets            81.55   (2.3%)     81.35   (2.9%)    -0.3% (  -5% -   5%)  0.762
IntNRQ                          63.36   (2.0%)     63.21   (2.5%)    -0.2% (  -4% -   4%)  0.740
Fuzzy2                          73.24   (5.5%)     73.10   (6.2%)    -0.2% ( -11% -  12%)  0.916
AndHighMedDayTaxoFacets         76.08   (3.5%)     75.98   (3.4%)    -0.1% (  -6% -   7%)  0.905
AndHighHigh                     62.20   (2.0%)     62.18   (2.4%)    -0.0% (  -4% -   4%)  0.954
BrowseMonthTaxoFacets        11993.48   (6.7%)  11989.53   (4.8%)    -0.0% ( -10% -  12%)  0.986
OrHighNotLow                   732.82   (7.2%)    732.80   (6.2%)    -0.0% ( -12% -  14%)  0.999
Fuzzy1                          46.43   (5.3%)     46.45   (6.0%)     0.0% ( -10% -  11%)  0.989
LowTerm                       1608.25   (6.0%)   1608.84   (4.9%)     0.0% ( -10% -  11%)  0.983
OrHighMed                       75.90   (2.3%)     75.93   (1.8%)     0.0% (  -3% -   4%)  0.939
LowPhrase                      273.81   (2.9%)    274.04   (3.3%)     0.1% (  -5% -   6%)  0.932
AndHighLow                     717.24   (6.1%)    718.17   (3.3%)     0.1% (  -8% -  10%)  0.933
AndHighHighDayTaxoFacets        39.63   (2.5%)     39.69   (2.6%)     0.1% (  -4% -   5%)  0.862
OrHighHigh                      34.63   (1.8%)     34.68   (2.0%)     0.1% (  -3% -   4%)  0.821
MedSloppyPhrase                158.80   (2.8%)    159.09   (2.6%)     0.2% (  -5% -   5%)  0.832
OrHighLow                      257.77   (2.9%)    258.46   (4.6%)     0.3% (  -7% -   8%)  0.826
AndHighMed                     133.43   (2.1%)    133.79   (2.7%)     0.3% (  -4% -   5%)  0.726
HighTermMonthSort              145.28  (10.8%)    145.88  (11.2%)     0.4% ( -19% -  25%)  0.905
OrHighNotHigh                  834.99   (6.1%)    839.62   (5.7%)     0.6% ( -10% -  13%)  0.766
TermDTSort                      83.66   (9.6%)     84.30  (11.1%)     0.8% ( -18% -  23%)  0.817
BrowseDayOfYearTaxoFacets    11639.59   (5.1%)  11777.38   (6.0%)     1.2% (  -9% -  12%)  0.502
MedTerm                       1473.62   (7.4%)   1493.79   (6.4%)     1.4% ( -11% -  16%)  0.530
HighTermTitleBDVSort           114.98  (16.7%)    117.30  (18.8%)     2.0% ( -28% -  45%)  0.720
HighTermDayOfYearSort          128.29  (17.2%)    132.83  (22.6%)     3.5% ( -30% -  52%)  0.577
BrowseDateTaxoFacets            19.25  (20.4%)     26.77   (3.7%)    39.1% (  12% -  79%)  0.000
BrowseRandomLabelSSDVFacets     10.38   (3.5%)     18.03   (6.8%)    73.7% (  61% -  87%)  0.000
BrowseMonthSSDVFacets           15.71   (3.6%)     34.59  (12.4%)   120.1% ( 100% - 141%)  0.000
BrowseDayOfYearSSDVFacets       14.31   (3.3%)     33.54  (12.9%)   134.4% ( 114% - 155%)  0.000
```
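
To make the block-caching reader idea from the comment concrete, here is a minimal, self-contained sketch. It is not the code from LUCENE-10334 and it does not use Lucene's `ForUtil` or `LongValues`: plain fixed-width bit packing stands in for the real block codec, and the class name `BlockCachedLongValues`, the `pack` helper, and the `decodeCount` counter are all hypothetical names chosen for illustration. It only shows the access pattern: decode a whole block on a miss, then serve later lookups in the same block from the cached array.

```java
/**
 * Hypothetical sketch (not Lucene code): non-negative longs stored bit-packed,
 * decoded one block at a time, with the most recently decoded block cached.
 */
final class BlockCachedLongValues {
  private final long[] packed;      // all values, bit-packed back to back
  private final int bitsPerValue;   // fixed width of every value (1..63)
  private final int blockSize;      // number of values decoded together
  private final int size;           // total number of values
  private final long[] cache;       // decoded values of the cached block
  private int cachedBlock = -1;     // index of the cached block, -1 if none
  int decodeCount = 0;              // for demonstration only: block decodes so far

  private BlockCachedLongValues(long[] packed, int bitsPerValue, int blockSize, int size) {
    this.packed = packed;
    this.bitsPerValue = bitsPerValue;
    this.blockSize = blockSize;
    this.size = size;
    this.cache = new long[blockSize];
  }

  /** Packs non-negative values using the smallest width that fits the largest one. */
  static BlockCachedLongValues pack(long[] values, int blockSize) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    long orAll = 0;
    for (long v : values) {
      if (v < 0) {
        throw new IllegalArgumentException("negative values are not supported in this sketch");
      }
      orAll |= v;
    }
    int bits = Math.max(1, 64 - Long.numberOfLeadingZeros(orAll));
    long totalBits = (long) values.length * bits;
    long[] packed = new long[(int) ((totalBits + 63) >>> 6)];
    for (int i = 0; i < values.length; i++) {
      long bitPos = (long) i * bits;
      int word = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      packed[word] |= values[i] << shift;
      if (shift + bits > 64) {
        // The value straddles two words: spill the high bits into the next word.
        packed[word + 1] |= values[i] >>> (64 - shift);
      }
    }
    return new BlockCachedLongValues(packed, bits, blockSize, values.length);
  }

  /** Returns the value at index, decoding its block only if it is not the cached one. */
  long get(int index) {
    if (index < 0 || index >= size) {
      throw new IndexOutOfBoundsException("index " + index + ", size " + size);
    }
    int block = index / blockSize;
    if (block != cachedBlock) {
      decodeBlock(block);
    }
    return cache[index - block * blockSize];
  }

  private void decodeBlock(int block) {
    int start = block * blockSize;
    int end = Math.min(start + blockSize, size);
    long mask = (1L << bitsPerValue) - 1;
    for (int i = start; i < end; i++) {
      long bitPos = (long) i * bitsPerValue;
      int word = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      long v = packed[word] >>> shift;
      if (shift + bitsPerValue > 64) {
        v |= packed[word + 1] << (64 - shift);
      }
      cache[i - start] = v & mask;
    }
    cachedBlock = block;
    decodeCount++;
  }
}
```

With this shape, a sequential scan pays one decode per block, while random access that jumps between blocks re-decodes on every jump. That trade-off is presumably why the gains in the table show up in the facet-style tasks that scan values, but that reading is an inference from the sketch, not something measured here.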
