[ https://issues.apache.org/jira/browse/LUCENE-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand updated LUCENE-6198:
---------------------------------
Attachment: LUCENE-6198.patch
New patch that adds two-phase support to ConjunctionScorer. luceneutil seems
happy with the patch too:
{noformat}
Task              QPS baseline   StdDev  QPS patch   StdDev  Pct diff
HighPhrase               12.26  (11.3%)      11.89   (5.3%)     -3.0%  (-17% - 15%)
AndHighLow              894.95   (9.5%)     874.08   (2.9%)     -2.3%  (-13% - 11%)
LowPhrase                18.81   (9.2%)      18.51   (4.8%)     -1.6%  (-14% - 13%)
Fuzzy1                   72.76  (12.2%)      71.65   (9.6%)     -1.5%  (-20% - 23%)
MedPhrase                54.31  (11.0%)      53.81   (3.2%)     -0.9%  (-13% - 14%)
LowTerm                 806.00  (11.9%)     808.20   (4.5%)      0.3%  (-14% - 18%)
Respell                  55.89  (10.2%)      56.57   (4.2%)      1.2%  (-11% - 17%)
OrNotHighLow           1102.88  (11.4%)    1116.63   (4.3%)      1.2%  (-13% - 19%)
LowSpanNear               9.48   (9.5%)       9.61   (4.4%)      1.4%  (-11% - 16%)
LowSloppyPhrase          71.86   (8.8%)      72.89   (3.5%)      1.4%  ( -9% - 15%)
MedSloppyPhrase          29.92  (10.3%)      30.35   (4.2%)      1.4%  (-11% - 17%)
MedSpanNear              79.24   (8.6%)      80.39   (3.2%)      1.5%  ( -9% - 14%)
IntNRQ                   16.81   (9.4%)      17.06   (6.1%)      1.5%  (-12% - 18%)
HighSloppyPhrase         23.27  (11.6%)      23.64   (8.1%)      1.6%  (-16% - 24%)
OrHighHigh               16.79  (10.6%)      17.08   (7.7%)      1.7%  (-15% - 22%)
OrHighNotLow             84.84  (10.3%)      86.32   (3.2%)      1.7%  (-10% - 17%)
OrNotHighHigh            56.28   (9.4%)      57.30   (1.9%)      1.8%  ( -8% - 14%)
HighTerm                123.91  (10.8%)     126.29   (2.8%)      1.9%  (-10% - 17%)
MedTerm                 243.44  (11.1%)     248.40   (2.9%)      2.0%  (-10% - 18%)
Wildcard                 74.84   (9.9%)      76.36   (3.1%)      2.0%  ( -9% - 16%)
OrHighNotHigh            45.48   (9.9%)      46.47   (1.9%)      2.2%  ( -8% - 15%)
OrHighLow                79.36  (11.3%)      81.10   (6.5%)      2.2%  (-14% - 22%)
Prefix3                  74.29  (10.5%)      75.96   (4.9%)      2.2%  (-11% - 19%)
OrHighNotMed             53.37  (10.7%)      54.62   (2.5%)      2.3%  ( -9% - 17%)
PKLookup                266.92  (10.4%)     273.30   (3.4%)      2.4%  (-10% - 18%)
HighSpanNear             19.64  (10.4%)      20.11   (3.0%)      2.4%  ( -9% - 17%)
OrNotHighMed            167.57  (11.7%)     171.67   (2.4%)      2.4%  (-10% - 18%)
OrHighMed                72.90  (12.5%)      74.87   (6.6%)      2.7%  (-14% - 24%)
Fuzzy2                   50.70  (13.8%)      52.58   (8.4%)      3.7%  (-16% - 30%)
AndHighMed              160.13  (10.1%)     169.60   (3.4%)      5.9%  ( -6% - 21%)
AndHighHigh              69.49   (8.8%)      74.19   (3.3%)      6.8%  ( -4% - 20%)
{noformat}
> two phase intersection
> ----------------------
>
> Key: LUCENE-6198
> URL: https://issues.apache.org/jira/browse/LUCENE-6198
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Robert Muir
> Attachments: LUCENE-6198.patch, LUCENE-6198.patch, LUCENE-6198.patch
>
>
> Currently some scorers have to do a lot of per-document work to determine if
> a document is a match. The simplest example is a phrase scorer, but there are
> others (spans, sloppy phrase, geospatial, etc).
> Imagine a conjunction with two MUST clauses, one that is a term that matches
> all odd documents, another that is a phrase matching all even documents.
> Today this conjunction will be very expensive, because the zig-zag
> intersection is reading a ton of useless positions.
> The same problem happens with filteredQuery and anything else that acts like
> a conjunction.
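
To make the description above concrete, here is a minimal sketch of a two-phase intersection, assuming each clause can expose a cheap approximation (a sorted list of candidate doc IDs) plus a costly confirmation step (for a phrase, reading positions). The names TwoPhaseSketch, Clause and intersect are hypothetical; this is not the Lucene API and not the code in the attached patch. The zig-zag runs over the approximations only, and confirmation is attempted only on documents where every approximation agrees.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical illustration only; none of these names are Lucene APIs. */
final class TwoPhaseSketch {

    /** A clause split into a cheap approximation and a costly confirmation. */
    static final class Clause {
        final int[] approximation;  // sorted candidate doc IDs, a superset of the true matches
        final IntPredicate confirm; // expensive per-document check, e.g. reading positions

        Clause(int[] approximation, IntPredicate confirm) {
            this.approximation = approximation;
            this.confirm = confirm;
        }
    }

    /** Intersects two clauses, confirming only where both approximations agree. */
    static List<Integer> intersect(Clause a, Clause b) {
        List<Integer> hits = new ArrayList<>();
        int i = 0;
        int j = 0;
        // Phase 1: zig-zag over the cheap approximations only.
        while (i < a.approximation.length && j < b.approximation.length) {
            int docA = a.approximation[i];
            int docB = b.approximation[j];
            if (docA < docB) {
                i++;
            } else if (docB < docA) {
                j++;
            } else {
                // Phase 2: run the expensive checks on the agreed candidate.
                if (a.confirm.test(docA) && b.confirm.test(docA)) {
                    hits.add(docA);
                }
                i++;
                j++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] phraseChecks = {0}; // counts how often the costly check runs

        // Term clause: matches all odd docs, needs no confirmation.
        Clause term = new Clause(new int[] {1, 3, 5, 7, 9}, doc -> true);

        // Phrase clause: its terms co-occur in the even docs and in doc 3,
        // but the phrase itself only matches the even docs.
        Clause phrase = new Clause(new int[] {0, 2, 3, 4, 6, 8}, doc -> {
            phraseChecks[0]++;
            return doc % 2 == 0;
        });

        List<Integer> hits = intersect(term, phrase);
        // prints "hits=[] phraseChecks=1"
        System.out.println("hits=" + hits + " phraseChecks=" + phraseChecks[0]);
    }
}
```

In this toy data the costly phrase check runs once (for doc 3) rather than once per document in the phrase's candidate list, which is the saving the issue is after.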