[
https://issues.apache.org/jira/browse/LUCENE-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305637#comment-14305637
]
Robert Muir commented on LUCENE-6218:
-------------------------------------
Here is the standard benchmark. You can see the optimization happening for the
MUST_NOT clauses:
{noformat}
            Task  QPS trunk   StdDev  QPS patch   StdDev  Pct diff
    OrHighNotLow     108.19   (4.1%)     105.11   (6.7%)   -2.8% (-13% -  8%)
    OrHighNotMed      89.28   (3.7%)      87.15   (6.3%)   -2.4% (-11% -  7%)
        HighTerm     120.82   (5.1%)     118.25   (6.1%)   -2.1% (-12% -  9%)
         MedTerm     177.26   (4.8%)     173.98   (5.8%)   -1.9% (-11% -  9%)
         LowTerm     950.16   (4.4%)     934.26   (4.6%)   -1.7% (-10% -  7%)
   OrHighNotHigh      29.55   (3.2%)      29.14   (5.7%)   -1.4% ( -9% -  7%)
     MedSpanNear     144.83   (3.7%)     143.30   (4.5%)   -1.1% ( -8% -  7%)
        Wildcard      45.54   (5.3%)      45.17   (6.1%)   -0.8% (-11% - 11%)
         Prefix3     214.45   (5.5%)     213.06   (7.6%)   -0.6% (-13% - 13%)
     LowSpanNear      28.04   (2.7%)      27.86   (3.3%)   -0.6% ( -6% -  5%)
      AndHighLow    1171.37   (2.4%)    1165.20   (3.0%)   -0.5% ( -5% -  5%)
    HighSpanNear     144.44   (3.9%)     143.73   (5.0%)   -0.5% ( -9% -  8%)
   OrNotHighHigh      49.49   (3.2%)      49.25   (5.8%)   -0.5% ( -9% -  8%)
          IntNRQ       8.45   (7.7%)       8.41  (10.3%)   -0.5% (-17% - 19%)
     AndHighHigh      88.18   (1.6%)      87.78   (1.9%)   -0.5% ( -3% -  3%)
      AndHighMed     123.35   (1.7%)     123.11   (1.8%)   -0.2% ( -3% -  3%)
         Respell      89.47   (1.9%)      89.44   (1.4%)   -0.0% ( -3% -  3%)
          Fuzzy1     109.20   (1.8%)     109.63   (1.3%)    0.4% ( -2% -  3%)
          Fuzzy2      67.56   (2.1%)      67.85   (1.5%)    0.4% ( -3% -  4%)
       LowPhrase      34.54   (2.0%)      34.76   (1.9%)    0.6% ( -3% -  4%)
 LowSloppyPhrase     119.91   (2.6%)     120.75   (2.4%)    0.7% ( -4% -  5%)
      OrHighHigh      27.37   (9.3%)      27.71   (8.6%)    1.2% (-15% - 21%)
       OrHighMed      58.23   (8.7%)      58.97   (8.0%)    1.3% (-14% - 19%)
       OrHighLow      56.42   (8.7%)      57.23   (7.9%)    1.4% (-13% - 19%)
 MedSloppyPhrase      15.92   (4.0%)      16.19   (4.3%)    1.7% ( -6% - 10%)
HighSloppyPhrase      13.52  (12.1%)      13.77   (8.6%)    1.9% (-16% - 25%)
      HighPhrase      17.50   (4.5%)      17.99   (4.2%)    2.8% ( -5% - 12%)
       MedPhrase     253.02   (5.7%)     261.32   (6.1%)    3.3% ( -8% - 15%)
    OrNotHighMed     185.01   (1.9%)     205.45   (3.6%)   11.0% (  5% - 16%)
    OrNotHighLow     959.96   (2.2%)    1144.49   (3.5%)   19.2% ( 13% - 25%)
{noformat}
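
For reading the table: the Pct diff column appears to be the relative change in QPS from trunk to patch. The sketch below recomputes it for the two rows that stand out. The class and method names are made up for illustration, the benchmark tool's own formula is not shown in this message, and the parenthesized range is not reproduced here.

```java
// Hypothetical helper, not part of the benchmark tool.
class PctDiffSketch {
    /** Relative change from trunk to patch, in percent (assumed definition of "Pct diff"). */
    static double pctDiff(double qpsTrunk, double qpsPatch) {
        return 100.0 * (qpsPatch - qpsTrunk) / qpsTrunk;
    }

    public static void main(String[] args) {
        // QPS values copied from the table above.
        System.out.printf("OrNotHighMed %.1f%%%n", pctDiff(185.01, 205.45));
        System.out.printf("OrNotHighLow %.1f%%%n", pctDiff(959.96, 1144.49));
    }
}
```

Under that assumption the recomputed values round to the 11.0% and 19.2% reported for OrNotHighMed and OrNotHighLow.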
> don't decode freqs or enumerate all positions, when scores are not needed
> -------------------------------------------------------------------------
>
> Key: LUCENE-6218
> URL: https://issues.apache.org/jira/browse/LUCENE-6218
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Attachments: LUCENE-6218.patch
>
>
> Today, if you don't call score(), some things are already faster: we won't
> invoke the similarity, read the norm for the document, and so on.
> On the other hand, it's sad that in this case we still decompress twice as
> many packed integers as we need (freqs can be skipped over, and our postings
> lists support that) and walk all positions in phrase matching to
> determine the number of times the phrase matched (one match is enough; then
> we can stop).
> When scoring is not needed, things can be optimized in other cases too (e.g.
> that's the whole concept of filters).
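
To illustrate the phrase-matching point in the description, here is a minimal sketch, not Lucene's actual code or API. It assumes an exact two-term phrase and that each term's positions within one document arrive as a sorted int array; the class and method names are hypothetical. phraseFreq walks all positions to count every occurrence, which is what scoring needs, while phraseMatches stops at the first occurrence, which is enough when scores are not needed.

```java
// Hypothetical sketch; names and data layout are assumptions, not Lucene internals.
class PhraseMatchSketch {
    /** Counts every place where a position in posB directly follows one in posA. */
    static int phraseFreq(int[] posA, int[] posB) {
        int freq = 0;
        int i = 0, j = 0;
        while (i < posA.length && j < posB.length) {
            int want = posA[i] + 1;      // second term must sit right after the first
            if (posB[j] < want) {
                j++;                     // second term is behind, advance it
            } else if (posB[j] > want) {
                i++;                     // first term is behind, advance it
            } else {
                freq++;                  // one occurrence of the phrase
                i++;
                j++;
            }
        }
        return freq;
    }

    /** Same walk, but returns as soon as a single occurrence is found. */
    static boolean phraseMatches(int[] posA, int[] posB) {
        int i = 0, j = 0;
        while (i < posA.length && j < posB.length) {
            int want = posA[i] + 1;
            if (posB[j] < want) {
                j++;
            } else if (posB[j] > want) {
                i++;
            } else {
                return true;             // no need to enumerate remaining positions
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] first = {1, 5, 9};
        int[] second = {2, 6, 20};
        System.out.println(phraseFreq(first, second));
        System.out.println(phraseMatches(first, second));
    }
}
```

With these inputs phraseFreq has to visit all positions, while phraseMatches returns after comparing the first pair. The saving grows with the number of positions per document.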