[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100492#comment-14100492 ]
Michael McCandless commented on LUCENE-4396:
--------------------------------------------
OK indeed I see effectively no perf diffs for the default tasks:
{noformat}
Report after iter 19:
Task              QPS base    StdDev  QPS comp    StdDev  Pct diff
LowTerm             159.50   (18.8%)    157.56   (17.0%)   -1.2% ( -31% -  42%)
LowPhrase             9.13    (2.3%)      9.10    (3.0%)   -0.3% (  -5% -   5%)
HighPhrase           22.96    (3.2%)     22.89    (4.0%)   -0.3% (  -7% -   7%)
MedPhrase            20.96    (2.6%)     20.91    (3.5%)   -0.2% (  -6% -   6%)
LowSloppyPhrase       9.01    (4.2%)      9.02    (4.3%)    0.1% (  -8% -   8%)
Fuzzy1               34.93    (4.3%)     34.96    (5.2%)    0.1% (  -9% -  10%)
Respell              23.59    (2.9%)     23.64    (2.9%)    0.2% (  -5% -   6%)
MedSloppyPhrase      27.69    (5.1%)     27.76    (4.8%)    0.3% (  -9% -  10%)
HighSloppyPhrase      6.39    (6.3%)      6.41    (6.4%)    0.3% ( -11% -  13%)
AndHighMed           39.17    (1.9%)     39.30    (2.1%)    0.4% (  -3% -   4%)
MedTerm              76.73    (9.0%)     77.02    (8.6%)    0.4% ( -15% -  19%)
AndHighHigh          15.19    (1.6%)     15.26    (2.4%)    0.4% (  -3% -   4%)
MedSpanNear           4.14    (4.7%)      4.16    (5.7%)    0.4% (  -9% -  11%)
HighSpanNear          1.49    (3.3%)      1.50    (4.6%)    0.5% (  -7% -   8%)
LowSpanNear           8.60    (6.0%)      8.67    (7.5%)    0.8% ( -11% -  15%)
HighTerm             13.12    (8.6%)     13.24   (10.1%)    0.9% ( -16% -  21%)
OrHighMed            15.47    (6.3%)     15.62    (6.0%)    0.9% ( -10% -  14%)
OrNotHighHigh         8.61    (7.2%)      8.70    (6.9%)    1.1% ( -12% -  16%)
OrHighNotLow         26.60    (5.8%)     26.95    (5.9%)    1.3% (  -9% -  13%)
OrHighNotMed         14.53    (6.6%)     14.72    (6.1%)    1.3% ( -10% -  15%)
OrHighLow            12.25    (6.5%)     12.42    (6.9%)    1.4% ( -11% -  15%)
OrHighNotHigh         4.06    (7.3%)      4.12    (6.5%)    1.4% ( -11% -  16%)
Prefix3              30.14    (3.5%)     30.58    (4.2%)    1.4% (  -6% -   9%)
OrHighHigh           18.13    (6.1%)     18.40    (6.1%)    1.5% ( -10% -  14%)
OrNotHighLow         14.43    (7.6%)     14.65    (7.6%)    1.5% ( -12% -  18%)
Wildcard             15.10    (4.2%)     15.34    (6.2%)    1.6% (  -8% -  12%)
Fuzzy2               20.01    (4.0%)     20.39    (3.7%)    1.9% (  -5% -  10%)
AndHighLow          278.92    (3.2%)    284.79    (3.7%)    2.1% (  -4% -   9%)
OrNotHighMed          5.10    (7.8%)      5.22    (7.6%)    2.2% ( -12% -  19%)
IntNRQ                1.63    (5.7%)      1.69   (10.3%)    3.7% ( -11% -  20%)
{noformat}
I'll run with And.tasks next...
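
For anyone reading the report above, here is a rough sketch of how the "Pct diff" column can be reproduced from the other four columns. It assumes (this is my reading of the numbers, not something stated in the report) that the mean is the relative QPS change and that the range in parentheses pairs comp at one standard deviation below/above with base at one standard deviation above/below. The function name pct_diff is made up for illustration. Because the printed QPS and StdDev values are already rounded, the recomputed range can be off by about a point.

```python
def pct_diff(qps_base, sd_base_pct, qps_comp, sd_comp_pct):
    """Return (mean, worst, best) change of comp relative to base, in percent.

    Assumption: the range uses +/- one standard deviation on each side.
    """
    # The report prints StdDev as a percentage of the QPS; convert to absolute.
    sd_base = qps_base * sd_base_pct / 100.0
    sd_comp = qps_comp * sd_comp_pct / 100.0

    mean = 100.0 * (qps_comp - qps_base) / qps_base

    # Worst case: comp at its low end compared against base at its high end.
    base_hi = qps_base + sd_base
    worst = 100.0 * ((qps_comp - sd_comp) - base_hi) / base_hi

    # Best case: comp at its high end compared against base at its low end.
    base_lo = qps_base - sd_base
    best = 100.0 * ((qps_comp + sd_comp) - base_lo) / base_lo

    return mean, worst, best


if __name__ == "__main__":
    # LowTerm row from the report above.
    mean, worst, best = pct_diff(159.50, 18.8, 157.56, 17.0)
    print(f"LowTerm: {mean:.1f}% ({worst:.0f}% - {best:.0f}%)")
```

Read this way, every range in the report straddles zero, which is why none of the diffs count as a real change.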
> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
> Key: LUCENE-4396
> URL: https://issues.apache.org/jira/browse/LUCENE-4396
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Attachments: And.tasks, And.tasks, AndOr.tasks, AndOr.tasks,
> LUCENE-4396-simple.patch, LUCENE-4396-simple.patch, LUCENE-4396-simple.patch,
> LUCENE-4396-simple.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, SIZE.perf, all.perf,
> luceneutil-score-equal.patch, luceneutil-score-equal.patch,
> merge-simple.perf, merge-simple.png, merge.perf, merge.png, perf.png,
> stat.cpp, stat.cpp, tasks.cpp
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that, unless the MUST clauses have a very low hit count compared
> to the other clauses, BooleanScorer would perform better than
> BooleanScorer2. BooleanScorer still has some vestiges from when it used to
> handle MUST so it shouldn't be hard to bring back this capability ... I think
> the challenging part might be the heuristics on when to use which (likely we
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs
> in this case, e.g. if the MUST clause suddenly skips 1000000 docs then you want
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you
> are inspired!
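
To illustrate the .advance() idea from the description, here is a toy sketch in Python. It is not Lucene code: PostingsIterator and match_must_with_should are hypothetical names, postings are plain sorted lists of doc IDs, and the loop is doc-at-a-time rather than BooleanScorer's bulk scoring. It only shows the one point that when the MUST clause jumps far ahead, each SHOULD clause can skip straight to the new target instead of stepping through every doc in between.

```python
from bisect import bisect_left

# Sentinel for an exhausted iterator (same value as Java's Integer.MAX_VALUE).
NO_MORE_DOCS = 2**31 - 1


class PostingsIterator:
    """Toy stand-in for one clause's postings: a sorted list of doc IDs."""

    def __init__(self, doc_ids):
        self.doc_ids = doc_ids
        self.pos = 0

    def doc_id(self):
        if self.pos < len(self.doc_ids):
            return self.doc_ids[self.pos]
        return NO_MORE_DOCS

    def next_doc(self):
        self.pos += 1
        return self.doc_id()

    def advance(self, target):
        # Jump to the first doc ID >= target, never moving backwards.
        self.pos = bisect_left(self.doc_ids, target, self.pos)
        return self.doc_id()


def match_must_with_should(must, shoulds):
    """Return (doc, number of SHOULD clauses matching doc) for each MUST doc."""
    results = []
    doc = must.doc_id()
    while doc != NO_MORE_DOCS:
        matched = 0
        for sub in shoulds:
            # The MUST clause leads; SHOULD clauses skip ahead to catch up.
            if sub.doc_id() < doc:
                sub.advance(doc)
            if sub.doc_id() == doc:
                matched += 1
        results.append((doc, matched))
        doc = must.next_doc()
    return results


if __name__ == "__main__":
    must = PostingsIterator([5, 1000005])
    shoulds = [
        PostingsIterator([1, 2, 5, 7, 1000005]),
        PostingsIterator([3, 1000005, 1000006]),
    ]
    print(match_must_with_should(must, shoulds))
```

Here the MUST clause skips from doc 5 to doc 1000005, and each SHOULD iterator reaches the new target with a single advance() call.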
--
This message was sent by Atlassian JIRA
(v6.2#6252)