[
https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085276#comment-14085276
]
Paul Elschot commented on LUCENE-4396:
--------------------------------------
bq. By the perf. table of BQ, it looks that BQ perfs low on the first 4 cases.
However, when I run these cases one by one, they're just worse than the trunk
within 2%.
The only reason I can think of that might cause this is that in the mixed case
there is less effective use of caches (L1, L2, L3) during the test. When this
is the case the max difference of 20% should go down when running for example
twice as many of the same queries in one go during the mixed test.
Btw. BooleanScorer2 is now almost ten years old, see LUCENE-294. Lots of
improvements were made since then and I'm happy to see some further possible
performance improvements here.
> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
> Key: LUCENE-4396
> URL: https://issues.apache.org/jira/browse/LUCENE-4396
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Attachments: And.tasks, And.tasks, AndOr.tasks, AndOr.tasks,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, SIZE.perf, all.perf, luceneutil-score-equal.patch,
> luceneutil-score-equal.patch, stat.cpp, stat.cpp, tasks.cpp
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared
> to the other clauses, that BooleanScorer would perform better than
> BooleanScorer2. BooleanScorer still has some vestiges from when it used to
> handle MUST so it shouldn't be hard to bring back this capability ... I think
> the challenging part might be the heuristics on when to use which (likely we
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs
> in this case, eg if suddenly the MUST clause skips 1000000 docs then you want
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you
> are inspired!
--
This message was sent by Atlassian JIRA
(v6.2#6252)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]