[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses

Michael McCandless (JIRA) Sun, 08 Jun 2014 15:55:24 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021471#comment-14021471
 ]


Michael McCandless commented on LUCENE-4396:
--------------------------------------------

Thanks for the detailed explanation of the score differences, Da.  I think 
you are right that order of operations (with float casts in between) explains 
the score differences.
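
As a rough illustration of that effect (a standalone sketch, not code from 
the patch or from Lucene), the same partial scores can sum to different 
floats depending on the order of additions and on where the cast to float 
happens.  The class name, method names, and the deliberately extreme values 
are assumptions chosen to make the rounding visible:

```java
public class FloatOrderDemo {
    // Adds left to right, rounding to float after every addition.
    public static float sumAsFloat(float[] parts) {
        float sum = 0f;
        for (float p : parts) {
            sum += p;
        }
        return sum;
    }

    // Accumulates in double and casts to float once at the end.
    public static float sumAsDoubleThenCast(float[] parts) {
        double sum = 0d;
        for (float p : parts) {
            sum += p;
        }
        return (float) sum;
    }

    public static void main(String[] args) {
        float[] parts = {1e8f, 1f, -1e8f};
        float[] reordered = {1e8f, -1e8f, 1f};
        // The 1f is lost when added to 1e8f in float precision.
        System.out.println(sumAsFloat(parts));          // 0.0
        // In double the 1 survives until the final cast.
        System.out.println(sumAsDoubleThenCast(parts)); // 1.0
        // Same values, different order, float accumulation.
        System.out.println(sumAsFloat(reordered));      // 1.0
    }
}
```

Real scores differ by far less than this, but the mechanism is the same.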

{quote}
Do you mean merging And.tasks and AndOr.tasks ? If so, there's no need to
do that, because And.tasks contains all tasks in AndOr.tasks, although tasks'
names are changed.
{quote}

Ahh, OK, never mind.

bq. However, a linked list can be helpful when the required docs are extremely sparse.

True, but maybe in such cases (low freqs for the clauses) we should just use 
BS2.  I think BS/BNS do better for high-freq clauses?

{quote}
bq. If there's only 1 required clause sent to BS/BNS can't we use its scorer 
instead? Have you explored having BS interact directly with all the MUST 
clauses, rather than using ConjunctionScorer?

Hmm. I don't think that would be helpful. The reason is just the same as above.
{quote}

I think we may get better performance when the MUST clauses are high freq if 
we just use BooleanScorer to enumerate all the matching docs for each MUST 
clause, instead of going through ConjunctionScorer?
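
To make the two strategies concrete, here is a minimal sketch under stated 
assumptions: it is not BooleanScorer or ConjunctionScorer code.  Each clause 
is modeled as a sorted array of unique doc IDs, the window size is a 
made-up constant, and all names are hypothetical.  The first method 
intersects two clauses with a simple two-pointer merge (standing in for the 
conjunction approach); the second lets every clause bulk-mark its docs into 
a per-window counter array and then collects the docs that all clauses 
marked:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WindowedConjunctionSketch {
    // Hypothetical window size for this sketch.
    static final int WINDOW = 2048;

    // Two-pointer intersection of two sorted doc ID arrays.
    public static List<Integer> merge(int[] a, int[] b) {
        List<Integer> hits = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                hits.add(a[i]);
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i++;
            } else {
                j++;
            }
        }
        return hits;
    }

    // Window-at-a-time intersection: every clause enumerates all of its docs
    // in the current window, and a doc matches if all clauses counted it.
    // Assumes doc IDs are below maxDoc and maxDoc is far from Integer.MAX_VALUE.
    public static List<Integer> windowed(int[][] clauses, int maxDoc) {
        List<Integer> hits = new ArrayList<>();
        int[] pos = new int[clauses.length];   // read position per clause
        int[] counts = new int[WINDOW];        // matches per doc in the window
        for (int base = 0; base < maxDoc; base += WINDOW) {
            Arrays.fill(counts, 0);
            int end = Math.min(base + WINDOW, maxDoc);
            for (int c = 0; c < clauses.length; c++) {
                int[] docs = clauses[c];
                while (pos[c] < docs.length && docs[pos[c]] < end) {
                    counts[docs[pos[c]] - base]++;
                    pos[c]++;
                }
            }
            for (int d = 0; d < end - base; d++) {
                if (counts[d] == clauses.length) {
                    hits.add(base + d);
                }
            }
        }
        return hits;
    }
}
```

Both methods return the same hits; the windowed one touches every posting 
of every clause, which is why it would only be expected to pay off when the 
clauses are dense.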

> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
>                 Key: LUCENE-4396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, 
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, 
> LUCENE-4396.patch, luceneutil-score-equal.patch, luceneutil-score-equal.patch
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there are one or more MUST clauses, we always use BooleanScorer2.
> But I suspect that, unless the MUST clauses have a very low hit count 
> compared to the other clauses, BooleanScorer would perform better than 
> BooleanScorer2.  BooleanScorer still has some vestiges from when it used to 
> handle MUST, so it shouldn't be hard to bring back this capability ... I think 
> the challenging part might be the heuristics on when to use which (likely we 
> would have to use firstDocID as a proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs 
> in this case, e.g. if the MUST clause suddenly skips 1000000 docs then you 
> want to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you 
> are inspired!
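
The .advance() idea in the description above can be sketched roughly as 
follows.  This is an illustration only: IntArrayIterator is a hypothetical 
stand-in for a postings iterator (not Lucene's DocIdSetIterator), and the 
gap threshold is an assumed tuning knob, not a value from the issue.  When 
the MUST clause jumps far ahead, the SHOULD iterator skips with advance(); 
for small gaps it just steps with nextDoc():

```java
import java.util.Arrays;

public class GapAdvanceSketch {
    public static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Hypothetical iterator over a sorted array of unique doc IDs.
    public static final class IntArrayIterator {
        private final int[] docs;
        private int idx = -1;

        public IntArrayIterator(int[] sortedDocs) {
            this.docs = sortedDocs;
        }

        public int docID() {
            if (idx < 0) {
                return -1;
            }
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        public int nextDoc() {
            if (idx < docs.length) {
                idx++;
            }
            return docID();
        }

        // Moves to the first doc >= target that lies after the current one.
        public int advance(int target) {
            int from = Math.min(idx + 1, docs.length);
            int found = Arrays.binarySearch(docs, from, docs.length, target);
            idx = found >= 0 ? found : -found - 1;
            return docID();
        }
    }

    // Brings a SHOULD iterator up to mustDoc, skipping only on large gaps.
    public static int catchUp(IntArrayIterator should, int mustDoc, int gapThreshold) {
        if (should.docID() >= mustDoc) {
            return should.docID();
        }
        if ((long) mustDoc - should.docID() > gapThreshold) {
            return should.advance(mustDoc);
        }
        int doc = should.docID();
        while (doc < mustDoc) {
            doc = should.nextDoc();
        }
        return doc;
    }
}
```

A real heuristic would also have to weigh the cost of advance() against 
sequential reads, which this sketch ignores.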



--
This message was sent by Atlassian JIRA
(v6.2#6252)
