[
https://issues.apache.org/jira/browse/LUCENE-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313877#comment-17313877
]
Greg Miller edited comment on LUCENE-9850 at 4/2/21, 1:41 PM:
--------------------------------------------------------------
I haven't really been following along with what's going on in JDK17, but being
able to more explicitly generate vectorized instructions will be nice!
I optimized my branch a bit further. At this point I think it has all the
optimizations of ForDeltaUtil, and the only extra work is applying the
exceptions. I even pulled in the optimization to apply the prefix to two values
at a time (packed in a long). That code is over
[here|https://github.com/apache/lucene/compare/main...gsmiller:pfordocid-opto3].
(It's not polished at all... a bit hacky)
If you look at the code, one thing that may not be obvious is how
encodePrefixSum differs from encode. In ForDeltaUtil, the 0 bpv case implies
that all values were "1", while PForUtil encodes the actual "same" value,
allowing it to take on values other than "1". While it's pretty trivial, this
meant checking that value in decoding and special-casing 1 vs. non-1 values. I
took on the ForDeltaUtil logic here and assume it's always 1. It's minor and I
suspect we could remove this optimization, but I'm trying to get the decode
logic as close to ForDeltaUtil as possible.
The benchmark results are looking at lot better now, but maybe still some
regressions. I've seen a little variability in these results, so I'm not sure
how often they might present false-regression results on individual tasks?
Here's what I've got at this point:
{code:java}
TaskQPS baseline StdDevQPS pfordocids StdDev
Pct diff p-value
LowSpanNear 98.61 (2.2%) 95.57 (1.7%)
-3.1% ( -6% - 0%) 0.000
OrNotHighHigh 545.01 (3.8%) 531.11 (5.3%)
-2.6% ( -11% - 6%) 0.078
Wildcard 40.83 (4.1%) 40.05 (3.9%)
-1.9% ( -9% - 6%) 0.132
OrHighMed 102.39 (2.5%) 100.50 (2.5%)
-1.8% ( -6% - 3%) 0.021
AndHighHigh 50.93 (3.3%) 50.03 (3.1%)
-1.8% ( -7% - 4%) 0.079
TermDTSort 98.42 (11.6%) 96.72 (14.4%)
-1.7% ( -24% - 27%) 0.676
AndHighMed 68.10 (2.9%) 66.94 (2.9%)
-1.7% ( -7% - 4%) 0.063
HighTerm 1169.43 (4.4%) 1151.70 (5.1%)
-1.5% ( -10% - 8%) 0.314
BrowseMonthSSDVFacets 12.50 (5.6%) 12.31 (7.7%)
-1.5% ( -14% - 12%) 0.480
HighTermTitleBDVSort 157.42 (14.7%) 155.08 (15.6%)
-1.5% ( -27% - 33%) 0.757
OrHighNotLow 545.83 (5.7%) 537.85 (7.0%)
-1.5% ( -13% - 12%) 0.472
MedSpanNear 28.75 (2.4%) 28.34 (1.9%)
-1.4% ( -5% - 2%) 0.038
OrHighNotHigh 533.41 (4.6%) 526.33 (5.3%)
-1.3% ( -10% - 8%) 0.394
Fuzzy1 59.47 (6.0%) 58.72 (6.8%)
-1.3% ( -13% - 12%) 0.533
HighSpanNear 21.27 (2.6%) 21.03 (2.2%)
-1.1% ( -5% - 3%) 0.153
HighTermMonthSort 128.50 (12.2%) 127.11 (11.0%)
-1.1% ( -21% - 25%) 0.769
OrNotHighLow 640.89 (4.0%) 634.43 (3.5%)
-1.0% ( -8% - 6%) 0.395
OrHighHigh 21.11 (2.0%) 20.91 (1.8%)
-1.0% ( -4% - 2%) 0.113
MedPhrase 103.90 (3.0%) 103.05 (2.9%)
-0.8% ( -6% - 5%) 0.381
HighPhrase 172.59 (2.5%) 171.22 (2.5%)
-0.8% ( -5% - 4%) 0.320
OrHighNotMed 535.67 (4.8%) 531.54 (4.7%)
-0.8% ( -9% - 9%) 0.607
LowTerm 1094.41 (2.9%) 1087.97 (3.1%)
-0.6% ( -6% - 5%) 0.535
MedSloppyPhrase 12.91 (2.4%) 12.85 (2.5%)
-0.5% ( -5% - 4%) 0.542
IntNRQ 101.21 (0.5%) 100.81 (0.7%)
-0.4% ( -1% - 0%) 0.040
PKLookup 144.62 (3.0%) 144.11 (3.1%)
-0.4% ( -6% - 5%) 0.715
HighSloppyPhrase 3.75 (2.9%) 3.74 (3.0%)
-0.3% ( -6% - 5%) 0.726
HighIntervalsOrdered 16.00 (2.1%) 15.95 (1.8%)
-0.3% ( -4% - 3%) 0.597
HighTermDayOfYearSort 109.37 (11.0%) 109.03 (15.2%)
-0.3% ( -23% - 29%) 0.941
LowSloppyPhrase 41.05 (1.9%) 40.93 (2.1%)
-0.3% ( -4% - 3%) 0.635
MedTerm 1137.13 (4.1%) 1134.84 (4.2%)
-0.2% ( -8% - 8%) 0.877
BrowseDayOfYearTaxoFacets 4.24 (3.4%) 4.23 (3.2%)
-0.2% ( -6% - 6%) 0.885
Prefix3 263.31 (9.1%) 263.08 (9.2%)
-0.1% ( -16% - 20%) 0.976
BrowseDateTaxoFacets 4.23 (3.4%) 4.23 (3.2%)
-0.1% ( -6% - 6%) 0.941
Fuzzy2 44.14 (13.1%) 44.17 (14.2%)
0.1% ( -24% - 31%) 0.990
BrowseMonthTaxoFacets 5.00 (2.6%) 5.01 (2.2%)
0.1% ( -4% - 4%) 0.878
OrNotHighMed 508.22 (3.6%) 508.82 (3.9%)
0.1% ( -7% - 7%) 0.921
Respell 38.75 (2.3%) 38.82 (2.2%)
0.2% ( -4% - 4%) 0.791
BrowseDayOfYearSSDVFacets 11.38 (5.7%) 11.42 (5.8%)
0.3% ( -10% - 12%) 0.855
OrHighLow 220.12 (4.0%) 220.98 (3.7%)
0.4% ( -6% - 8%) 0.745
AndHighLow 620.94 (3.0%) 624.41 (3.3%)
0.6% ( -5% - 7%) 0.573
LowPhrase 132.77 (2.1%) 133.52 (2.4%)
0.6% ( -3% - 5%) 0.433
{code}
was (Author: gsmiller):
I haven't really been following along with what's going on in JDK17, but being
able to more explicitly generate vectorized instructions will be nice!
I optimized my branch a bit further. At this point I think it has all the
optimizations of ForDeltaUtil, and the only extra work is applying the
exceptions. I even pulled in the optimization to apply the prefix to two values
at a time (packed in a long). That code is over
[here|https://github.com/apache/lucene/compare/main...gsmiller:pfordocid-opto3].
(It's not polished at all... a bit hacky)
The benchmark results are looking at lot better now, but maybe still some
regressions. I've seen a little variability in these results, so I'm not sure
how often they might present false-regression results on individual tasks?
Here's what I've got at this point:
{code:java}
TaskQPS baseline StdDevQPS pfordocids StdDev
Pct diff p-value
LowSpanNear 98.61 (2.2%) 95.57 (1.7%)
-3.1% ( -6% - 0%) 0.000
OrNotHighHigh 545.01 (3.8%) 531.11 (5.3%)
-2.6% ( -11% - 6%) 0.078
Wildcard 40.83 (4.1%) 40.05 (3.9%)
-1.9% ( -9% - 6%) 0.132
OrHighMed 102.39 (2.5%) 100.50 (2.5%)
-1.8% ( -6% - 3%) 0.021
AndHighHigh 50.93 (3.3%) 50.03 (3.1%)
-1.8% ( -7% - 4%) 0.079
TermDTSort 98.42 (11.6%) 96.72 (14.4%)
-1.7% ( -24% - 27%) 0.676
AndHighMed 68.10 (2.9%) 66.94 (2.9%)
-1.7% ( -7% - 4%) 0.063
HighTerm 1169.43 (4.4%) 1151.70 (5.1%)
-1.5% ( -10% - 8%) 0.314
BrowseMonthSSDVFacets 12.50 (5.6%) 12.31 (7.7%)
-1.5% ( -14% - 12%) 0.480
HighTermTitleBDVSort 157.42 (14.7%) 155.08 (15.6%)
-1.5% ( -27% - 33%) 0.757
OrHighNotLow 545.83 (5.7%) 537.85 (7.0%)
-1.5% ( -13% - 12%) 0.472
MedSpanNear 28.75 (2.4%) 28.34 (1.9%)
-1.4% ( -5% - 2%) 0.038
OrHighNotHigh 533.41 (4.6%) 526.33 (5.3%)
-1.3% ( -10% - 8%) 0.394
Fuzzy1 59.47 (6.0%) 58.72 (6.8%)
-1.3% ( -13% - 12%) 0.533
HighSpanNear 21.27 (2.6%) 21.03 (2.2%)
-1.1% ( -5% - 3%) 0.153
HighTermMonthSort 128.50 (12.2%) 127.11 (11.0%)
-1.1% ( -21% - 25%) 0.769
OrNotHighLow 640.89 (4.0%) 634.43 (3.5%)
-1.0% ( -8% - 6%) 0.395
OrHighHigh 21.11 (2.0%) 20.91 (1.8%)
-1.0% ( -4% - 2%) 0.113
MedPhrase 103.90 (3.0%) 103.05 (2.9%)
-0.8% ( -6% - 5%) 0.381
HighPhrase 172.59 (2.5%) 171.22 (2.5%)
-0.8% ( -5% - 4%) 0.320
OrHighNotMed 535.67 (4.8%) 531.54 (4.7%)
-0.8% ( -9% - 9%) 0.607
LowTerm 1094.41 (2.9%) 1087.97 (3.1%)
-0.6% ( -6% - 5%) 0.535
MedSloppyPhrase 12.91 (2.4%) 12.85 (2.5%)
-0.5% ( -5% - 4%) 0.542
IntNRQ 101.21 (0.5%) 100.81 (0.7%)
-0.4% ( -1% - 0%) 0.040
PKLookup 144.62 (3.0%) 144.11 (3.1%)
-0.4% ( -6% - 5%) 0.715
HighSloppyPhrase 3.75 (2.9%) 3.74 (3.0%)
-0.3% ( -6% - 5%) 0.726
HighIntervalsOrdered 16.00 (2.1%) 15.95 (1.8%)
-0.3% ( -4% - 3%) 0.597
HighTermDayOfYearSort 109.37 (11.0%) 109.03 (15.2%)
-0.3% ( -23% - 29%) 0.941
LowSloppyPhrase 41.05 (1.9%) 40.93 (2.1%)
-0.3% ( -4% - 3%) 0.635
MedTerm 1137.13 (4.1%) 1134.84 (4.2%)
-0.2% ( -8% - 8%) 0.877
BrowseDayOfYearTaxoFacets 4.24 (3.4%) 4.23 (3.2%)
-0.2% ( -6% - 6%) 0.885
Prefix3 263.31 (9.1%) 263.08 (9.2%)
-0.1% ( -16% - 20%) 0.976
BrowseDateTaxoFacets 4.23 (3.4%) 4.23 (3.2%)
-0.1% ( -6% - 6%) 0.941
Fuzzy2 44.14 (13.1%) 44.17 (14.2%)
0.1% ( -24% - 31%) 0.990
BrowseMonthTaxoFacets 5.00 (2.6%) 5.01 (2.2%)
0.1% ( -4% - 4%) 0.878
OrNotHighMed 508.22 (3.6%) 508.82 (3.9%)
0.1% ( -7% - 7%) 0.921
Respell 38.75 (2.3%) 38.82 (2.2%)
0.2% ( -4% - 4%) 0.791
BrowseDayOfYearSSDVFacets 11.38 (5.7%) 11.42 (5.8%)
0.3% ( -10% - 12%) 0.855
OrHighLow 220.12 (4.0%) 220.98 (3.7%)
0.4% ( -6% - 8%) 0.745
AndHighLow 620.94 (3.0%) 624.41 (3.3%)
0.6% ( -5% - 7%) 0.573
LowPhrase 132.77 (2.1%) 133.52 (2.4%)
0.6% ( -3% - 5%) 0.433
{code}
> Explore PFOR for Doc ID delta encoding (instead of FOR)
> -------------------------------------------------------
>
> Key: LUCENE-9850
> URL: https://issues.apache.org/jira/browse/LUCENE-9850
> Project: Lucene - Core
> Issue Type: Task
> Components: core/codecs
> Affects Versions: main (9.0)
> Reporter: Greg Miller
> Priority: Minor
>
> It'd be interesting to explore using PFOR instead of FOR for doc ID encoding.
> Right now PFOR is used for positions, frequencies and payloads, but FOR is
> used for doc ID deltas. From a recent
> [conversation|http://mail-archives.apache.org/mod_mbox/lucene-dev/202103.mbox/%3CCAPsWd%2BOp7d_GxNosB5r%3DQMPA-v0SteHWjXUmG3gwQot4gkubWw%40mail.gmail.com%3E]
> on the dev mailing list, it sounds like this decision was made based on the
> optimization possible when expanding the deltas.
> I'd be interesting in measuring the index size reduction possible with
> switching to PFOR compared to the performance reduction we might see by no
> longer being able to apply the deltas in as optimal a way.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]