[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses

Da Huang (JIRA) Fri, 16 May 2014 08:27:33 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998736#comment-13998736
 ]


Da Huang commented on LUCENE-4396:
----------------------------------

Thanks for your suggestions!

{quote}
maybe we could test on fewer terms, for the
Low/HighAndManyLow/High tasks? I think it's more common to have a
handful (3-5 maybe) of terms.
{quote}
When there are only a few terms, BooleanNovelScorer is slower than BS (by about 10%).
However, I still need to generate tasks with fewer terms and rerun them to
confirm the exact perf. difference.

{quote}
 But maybe keep your current category
and rename it to Tons instead of Many?
{quote}
OK, I will do so.

{quote}
Maybe we can improve
the test so that it exercises BS and NBS? E.g., toggle the "require
docs in order" via a custom collector?
{quote}
Yes, I think that's a good idea.

{quote}
Hmm do we know why the scores changed?
{quote}
Yes, it's because the scores are summed in a different order.
BS2 adds up the scores of all SHOULD clauses first, and then adds their sum to
the final score.
BNS adds the score of each SHOULD clause to the final score one by one.
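
To illustrate why the order alone can change the result, here is a minimal
sketch. It is not code from the patch: the class name, method names and input
values are hypothetical, and the values are deliberately contrived so that the
effect is visible with only two SHOULD clauses. Float addition is not
associative, so summing the SHOULD scores first and adding them one by one can
round differently.

```java
final class ScoreOrderSketch {
    /** Sum the SHOULD scores first, then add that sum to the MUST score. */
    static float sumShouldFirst(float mustScore, float[] shouldScores) {
        float shouldSum = 0f;
        for (float s : shouldScores) {
            shouldSum += s;
        }
        return mustScore + shouldSum;
    }

    /** Add each SHOULD score to the running total one by one. */
    static float addOneByOne(float mustScore, float[] shouldScores) {
        float total = mustScore;
        for (float s : shouldScores) {
            total += s;
        }
        return total;
    }

    public static void main(String[] args) {
        // Contrived inputs (an assumption for illustration): near 1e8 a float
        // can only represent multiples of 8, so adding 4 twice is lost to
        // rounding, while adding 8 once is not.
        float must = 1e8f;
        float[] should = {4f, 4f};
        float a = sumShouldFirst(must, should);
        float b = addOneByOne(must, should);
        System.out.println(a != b); // prints true
    }
}
```

With realistic scores the difference is usually confined to the last bits of
the float, which is still enough to fail an exact score comparison.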

{quote}
Are we comparing BS2 to NovelBS?
{quote}
Yes.

{quote}
I think BS and BS2 already have different scores today?
{quote}
Yes. In fact, BS computes scores in the same order as BNS.

{quote}
but you commented this out in your patch in order to test NBS I
guess?
{quote}
Yes, I did that in order to test BNS. Otherwise, luceneutil would throw an
exception.

{quote}
Do you have any perf results of BS w/ required clauses (as a
BulkScorer) vs BS2 (what trunk does today)?
{quote}
Hmm, I haven't carried out such an experiment yet. Checking the perf. results of
BS vs BS2 is a good idea. I will do that.  :)



> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
>                 Key: LUCENE-4396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: AndOr.tasks, LUCENE-4396.patch, LUCENE-4396.patch, 
> LUCENE-4396.patch, LUCENE-4396.patch, luceneutil-score-equal.patch
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared 
> to the other clauses, that BooleanScorer would perform better than 
> BooleanScorer2.  BooleanScorer still has some vestiges from when it used to 
> handle MUST so it shouldn't be hard to bring back this capability ... I think 
> the challenging part might be the heuristics on when to use which (likely we 
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs 
> in this case, eg if suddenly the MUST clause skips 1000000 docs then you want 
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you 
> are inspired!
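
The .advance() idea in the description above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not
BooleanScorer's actual bucketed implementation: Cursor is a hypothetical
stand-in for a postings iterator over a sorted array of doc IDs, and collect()
is a made-up helper. The MUST clause leads, and each SHOULD clause is advanced
directly to the MUST doc instead of stepping through the gap.

```java
import java.util.ArrayList;
import java.util.List;

final class MustLeadsSketch {
    /** Hypothetical cursor over sorted doc IDs (stand-in for a postings iterator). */
    static final class Cursor {
        static final int NO_MORE_DOCS = Integer.MAX_VALUE;
        private final int[] docs;
        private int pos = -1;

        Cursor(int[] docs) {
            this.docs = docs;
        }

        /** Current doc ID, -1 before the first move, NO_MORE_DOCS when exhausted. */
        int docID() {
            if (pos < 0) {
                return -1;
            }
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        int nextDoc() {
            if (pos < docs.length) {
                pos++;
            }
            return docID();
        }

        /** Move to the first doc ID >= target (linear scan here for brevity). */
        int advance(int target) {
            if (pos < 0) {
                pos = 0;
            }
            while (pos < docs.length && docs[pos] < target) {
                pos++;
            }
            return docID();
        }
    }

    /** Returns {doc, number of matching SHOULD clauses} for every MUST doc. */
    static List<int[]> collect(int[] must, int[][] shoulds) {
        Cursor req = new Cursor(must);
        Cursor[] opts = new Cursor[shoulds.length];
        for (int i = 0; i < shoulds.length; i++) {
            opts[i] = new Cursor(shoulds[i]);
        }
        List<int[]> hits = new ArrayList<>();
        for (int doc = req.nextDoc(); doc != Cursor.NO_MORE_DOCS; doc = req.nextDoc()) {
            int matched = 0;
            for (Cursor opt : opts) {
                // If the MUST clause jumped ahead, skip the SHOULD clause to it.
                if (opt.docID() < doc) {
                    opt.advance(doc);
                }
                if (opt.docID() == doc) {
                    matched++;
                }
            }
            hits.add(new int[] {doc, matched});
        }
        return hits;
    }
}
```

For example, with MUST docs {5, 1000005} and SHOULD lists {1, 5, 7, 900, 1000005}
and {2, 3, 1000005}, collect() yields doc 5 with one matching SHOULD clause and
doc 1000005 with two. A real postings advance() would use skip data rather than
the linear scan shown here.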



--
This message was sent by Atlassian JIRA
(v6.2#6252)

