[ https://issues.apache.org/jira/browse/LUCENE-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313877#comment-17313877 ]
Greg Miller edited comment on LUCENE-9850 at 4/2/21, 1:41 PM: -------------------------------------------------------------- I haven't really been following along with what's going on in JDK17, but being able to more explicitly generate vectorized instructions will be nice! I optimized my branch a bit further. At this point I think it has all the optimizations of ForDeltaUtil, and the only extra work is applying the exceptions. I even pulled in the optimization to apply the prefix to two values at a time (packed in a long). That code is over [here|https://github.com/apache/lucene/compare/main...gsmiller:pfordocid-opto3]. (It's not polished at all... a bit hacky) If you look at the code, one thing that may not be obvious is how encodePrefixSum differs from encode. In ForDeltaUtil, the 0 bpv case implies that all values were "1", while PForUtil encodes the actual "same" value, allowing it to take on values other than "1". While it's pretty trivial, this meant checking that value in decoding and special-casing 1 vs. non-1 values. I took on the ForDeltaUtil logic here and assume it's always 1. It's minor and I suspect we could remove this optimization, but I'm trying to get the decode logic as close to ForDeltaUtil as possible. The benchmark results are looking at lot better now, but maybe still some regressions. I've seen a little variability in these results, so I'm not sure how often they might present false-regression results on individual tasks? Here's what I've got at this point: {code:java} TaskQPS baseline StdDevQPS pfordocids StdDev Pct diff p-value LowSpanNear 98.61 (2.2%) 95.57 (1.7%) -3.1% ( -6% - 0%) 0.000 OrNotHighHigh 545.01 (3.8%) 531.11 (5.3%) -2.6% ( -11% - 6%) 0.078 Wildcard 40.83 (4.1%) 40.05 (3.9%) -1.9% ( -9% - 6%) 0.132 OrHighMed 102.39 (2.5%) 100.50 (2.5%) -1.8% ( -6% - 3%) 0.021 AndHighHigh 50.93 (3.3%) 50.03 (3.1%) -1.8% ( -7% - 4%) 0.079 TermDTSort 98.42 (11.6%) 96.72 (14.4%) -1.7% ( -24% - 27%) 0.676 AndHighMed 68.10 (2.9%) 66.94 (2.9%) -1.7% ( -7% - 4%) 0.063 HighTerm 1169.43 (4.4%) 1151.70 (5.1%) -1.5% ( -10% - 8%) 0.314 BrowseMonthSSDVFacets 12.50 (5.6%) 12.31 (7.7%) -1.5% ( -14% - 12%) 0.480 HighTermTitleBDVSort 157.42 (14.7%) 155.08 (15.6%) -1.5% ( -27% - 33%) 0.757 OrHighNotLow 545.83 (5.7%) 537.85 (7.0%) -1.5% ( -13% - 12%) 0.472 MedSpanNear 28.75 (2.4%) 28.34 (1.9%) -1.4% ( -5% - 2%) 0.038 OrHighNotHigh 533.41 (4.6%) 526.33 (5.3%) -1.3% ( -10% - 8%) 0.394 Fuzzy1 59.47 (6.0%) 58.72 (6.8%) -1.3% ( -13% - 12%) 0.533 HighSpanNear 21.27 (2.6%) 21.03 (2.2%) -1.1% ( -5% - 3%) 0.153 HighTermMonthSort 128.50 (12.2%) 127.11 (11.0%) -1.1% ( -21% - 25%) 0.769 OrNotHighLow 640.89 (4.0%) 634.43 (3.5%) -1.0% ( -8% - 6%) 0.395 OrHighHigh 21.11 (2.0%) 20.91 (1.8%) -1.0% ( -4% - 2%) 0.113 MedPhrase 103.90 (3.0%) 103.05 (2.9%) -0.8% ( -6% - 5%) 0.381 HighPhrase 172.59 (2.5%) 171.22 (2.5%) -0.8% ( -5% - 4%) 0.320 OrHighNotMed 535.67 (4.8%) 531.54 (4.7%) -0.8% ( -9% - 9%) 0.607 LowTerm 1094.41 (2.9%) 1087.97 (3.1%) -0.6% ( -6% - 5%) 0.535 MedSloppyPhrase 12.91 (2.4%) 12.85 (2.5%) -0.5% ( -5% - 4%) 0.542 IntNRQ 101.21 (0.5%) 100.81 (0.7%) -0.4% ( -1% - 0%) 0.040 PKLookup 144.62 (3.0%) 144.11 (3.1%) -0.4% ( -6% - 5%) 0.715 HighSloppyPhrase 3.75 (2.9%) 3.74 (3.0%) -0.3% ( -6% - 5%) 0.726 HighIntervalsOrdered 16.00 (2.1%) 15.95 (1.8%) -0.3% ( -4% - 3%) 0.597 HighTermDayOfYearSort 109.37 (11.0%) 109.03 (15.2%) -0.3% ( -23% - 29%) 0.941 LowSloppyPhrase 41.05 (1.9%) 40.93 (2.1%) -0.3% ( -4% - 3%) 0.635 MedTerm 1137.13 (4.1%) 1134.84 (4.2%) -0.2% ( -8% - 8%) 0.877 BrowseDayOfYearTaxoFacets 4.24 (3.4%) 4.23 (3.2%) -0.2% ( -6% - 6%) 0.885 Prefix3 263.31 (9.1%) 263.08 (9.2%) -0.1% ( -16% - 20%) 0.976 BrowseDateTaxoFacets 4.23 (3.4%) 4.23 (3.2%) -0.1% ( -6% - 6%) 0.941 Fuzzy2 44.14 (13.1%) 44.17 (14.2%) 0.1% ( -24% - 31%) 0.990 BrowseMonthTaxoFacets 5.00 (2.6%) 5.01 (2.2%) 0.1% ( -4% - 4%) 0.878 OrNotHighMed 508.22 (3.6%) 508.82 (3.9%) 0.1% ( -7% - 7%) 0.921 Respell 38.75 (2.3%) 38.82 (2.2%) 0.2% ( -4% - 4%) 0.791 BrowseDayOfYearSSDVFacets 11.38 (5.7%) 11.42 (5.8%) 0.3% ( -10% - 12%) 0.855 OrHighLow 220.12 (4.0%) 220.98 (3.7%) 0.4% ( -6% - 8%) 0.745 AndHighLow 620.94 (3.0%) 624.41 (3.3%) 0.6% ( -5% - 7%) 0.573 LowPhrase 132.77 (2.1%) 133.52 (2.4%) 0.6% ( -3% - 5%) 0.433 {code} was (Author: gsmiller): I haven't really been following along with what's going on in JDK17, but being able to more explicitly generate vectorized instructions will be nice! I optimized my branch a bit further. At this point I think it has all the optimizations of ForDeltaUtil, and the only extra work is applying the exceptions. I even pulled in the optimization to apply the prefix to two values at a time (packed in a long). That code is over [here|https://github.com/apache/lucene/compare/main...gsmiller:pfordocid-opto3]. (It's not polished at all... a bit hacky) The benchmark results are looking at lot better now, but maybe still some regressions. I've seen a little variability in these results, so I'm not sure how often they might present false-regression results on individual tasks? Here's what I've got at this point: {code:java} TaskQPS baseline StdDevQPS pfordocids StdDev Pct diff p-value LowSpanNear 98.61 (2.2%) 95.57 (1.7%) -3.1% ( -6% - 0%) 0.000 OrNotHighHigh 545.01 (3.8%) 531.11 (5.3%) -2.6% ( -11% - 6%) 0.078 Wildcard 40.83 (4.1%) 40.05 (3.9%) -1.9% ( -9% - 6%) 0.132 OrHighMed 102.39 (2.5%) 100.50 (2.5%) -1.8% ( -6% - 3%) 0.021 AndHighHigh 50.93 (3.3%) 50.03 (3.1%) -1.8% ( -7% - 4%) 0.079 TermDTSort 98.42 (11.6%) 96.72 (14.4%) -1.7% ( -24% - 27%) 0.676 AndHighMed 68.10 (2.9%) 66.94 (2.9%) -1.7% ( -7% - 4%) 0.063 HighTerm 1169.43 (4.4%) 1151.70 (5.1%) -1.5% ( -10% - 8%) 0.314 BrowseMonthSSDVFacets 12.50 (5.6%) 12.31 (7.7%) -1.5% ( -14% - 12%) 0.480 HighTermTitleBDVSort 157.42 (14.7%) 155.08 (15.6%) -1.5% ( -27% - 33%) 0.757 OrHighNotLow 545.83 (5.7%) 537.85 (7.0%) -1.5% ( -13% - 12%) 0.472 MedSpanNear 28.75 (2.4%) 28.34 (1.9%) -1.4% ( -5% - 2%) 0.038 OrHighNotHigh 533.41 (4.6%) 526.33 (5.3%) -1.3% ( -10% - 8%) 0.394 Fuzzy1 59.47 (6.0%) 58.72 (6.8%) -1.3% ( -13% - 12%) 0.533 HighSpanNear 21.27 (2.6%) 21.03 (2.2%) -1.1% ( -5% - 3%) 0.153 HighTermMonthSort 128.50 (12.2%) 127.11 (11.0%) -1.1% ( -21% - 25%) 0.769 OrNotHighLow 640.89 (4.0%) 634.43 (3.5%) -1.0% ( -8% - 6%) 0.395 OrHighHigh 21.11 (2.0%) 20.91 (1.8%) -1.0% ( -4% - 2%) 0.113 MedPhrase 103.90 (3.0%) 103.05 (2.9%) -0.8% ( -6% - 5%) 0.381 HighPhrase 172.59 (2.5%) 171.22 (2.5%) -0.8% ( -5% - 4%) 0.320 OrHighNotMed 535.67 (4.8%) 531.54 (4.7%) -0.8% ( -9% - 9%) 0.607 LowTerm 1094.41 (2.9%) 1087.97 (3.1%) -0.6% ( -6% - 5%) 0.535 MedSloppyPhrase 12.91 (2.4%) 12.85 (2.5%) -0.5% ( -5% - 4%) 0.542 IntNRQ 101.21 (0.5%) 100.81 (0.7%) -0.4% ( -1% - 0%) 0.040 PKLookup 144.62 (3.0%) 144.11 (3.1%) -0.4% ( -6% - 5%) 0.715 HighSloppyPhrase 3.75 (2.9%) 3.74 (3.0%) -0.3% ( -6% - 5%) 0.726 HighIntervalsOrdered 16.00 (2.1%) 15.95 (1.8%) -0.3% ( -4% - 3%) 0.597 HighTermDayOfYearSort 109.37 (11.0%) 109.03 (15.2%) -0.3% ( -23% - 29%) 0.941 LowSloppyPhrase 41.05 (1.9%) 40.93 (2.1%) -0.3% ( -4% - 3%) 0.635 MedTerm 1137.13 (4.1%) 1134.84 (4.2%) -0.2% ( -8% - 8%) 0.877 BrowseDayOfYearTaxoFacets 4.24 (3.4%) 4.23 (3.2%) -0.2% ( -6% - 6%) 0.885 Prefix3 263.31 (9.1%) 263.08 (9.2%) -0.1% ( -16% - 20%) 0.976 BrowseDateTaxoFacets 4.23 (3.4%) 4.23 (3.2%) -0.1% ( -6% - 6%) 0.941 Fuzzy2 44.14 (13.1%) 44.17 (14.2%) 0.1% ( -24% - 31%) 0.990 BrowseMonthTaxoFacets 5.00 (2.6%) 5.01 (2.2%) 0.1% ( -4% - 4%) 0.878 OrNotHighMed 508.22 (3.6%) 508.82 (3.9%) 0.1% ( -7% - 7%) 0.921 Respell 38.75 (2.3%) 38.82 (2.2%) 0.2% ( -4% - 4%) 0.791 BrowseDayOfYearSSDVFacets 11.38 (5.7%) 11.42 (5.8%) 0.3% ( -10% - 12%) 0.855 OrHighLow 220.12 (4.0%) 220.98 (3.7%) 0.4% ( -6% - 8%) 0.745 AndHighLow 620.94 (3.0%) 624.41 (3.3%) 0.6% ( -5% - 7%) 0.573 LowPhrase 132.77 (2.1%) 133.52 (2.4%) 0.6% ( -3% - 5%) 0.433 {code} > Explore PFOR for Doc ID delta encoding (instead of FOR) > ------------------------------------------------------- > > Key: LUCENE-9850 > URL: https://issues.apache.org/jira/browse/LUCENE-9850 > Project: Lucene - Core > Issue Type: Task > Components: core/codecs > Affects Versions: main (9.0) > Reporter: Greg Miller > Priority: Minor > > It'd be interesting to explore using PFOR instead of FOR for doc ID encoding. > Right now PFOR is used for positions, frequencies and payloads, but FOR is > used for doc ID deltas. From a recent > [conversation|http://mail-archives.apache.org/mod_mbox/lucene-dev/202103.mbox/%3CCAPsWd%2BOp7d_GxNosB5r%3DQMPA-v0SteHWjXUmG3gwQot4gkubWw%40mail.gmail.com%3E] > on the dev mailing list, it sounds like this decision was made based on the > optimization possible when expanding the deltas. > I'd be interesting in measuring the index size reduction possible with > switching to PFOR compared to the performance reduction we might see by no > longer being able to apply the deltas in as optimal a way. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org