[ 
https://issues.apache.org/jira/browse/LUCENE-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564885#comment-17564885
 ] 

Adrien Grand commented on LUCENE-10480:
---------------------------------------

I haven't tried to reproduce it, but the steps you took by running on wikibigall 
with the nightly tasks file sound good to me. Another thing that sometimes 
changes performance is the doc ID order. Were you using multiple indexing 
threads, maybe?

Setting aside that we cannot reproduce the slowdown, the main difference I can 
think of between WANDScorer and BlockMaxMaxscoreScorer for AndHighOrMedMed is 
the way that {{advanceShallow}} is computed. Conjunctions use block boundaries 
of the clause that has the lowest 
cost, so this could explain why we are seeing a slowdown with AndHighOrMedMed 
(since the conjunction uses block boundaries of OrMedMed) and not 
AndMedOrHighHigh (since the conjunction uses block boundaries of Med). Maybe we 
could explore other approaches for {{advanceShallow}} such as taking the 
minimum block boundary across essential clauses only instead of all clauses.
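
To make the {{advanceShallow}} options above concrete, here is a minimal sketch 
in plain Java. It does not use Lucene's classes: {{Clause}}, {{blockEnd}}, 
{{boundaryOfLowestCost}} and {{minBoundary}} are hypothetical names, block 
boundaries are hard-coded arrays, and whether a clause is essential is a flag 
set by hand rather than derived from scores. It only illustrates how the 
choice of clauses changes the boundary that gets reported.

```java
import java.util.Arrays;
import java.util.List;

public class BlockBoundarySketch {

  /** Hypothetical stand-in for a scorer clause; this is not a Lucene class. */
  public static final class Clause {
    final int[] blockEnds;   // last doc ID of each block, in ascending order
    final long cost;         // estimated number of matching docs
    final boolean essential; // assumption: set by hand for this sketch

    public Clause(int[] blockEnds, long cost, boolean essential) {
      this.blockEnds = blockEnds;
      this.cost = cost;
      this.essential = essential;
    }

    /** Last doc ID of the block that contains target, or MAX_VALUE past the end. */
    int blockEnd(int target) {
      for (int end : blockEnds) {
        if (end >= target) {
          return end;
        }
      }
      return Integer.MAX_VALUE;
    }
  }

  /** Conjunction-style rule: follow the block boundaries of the cheapest clause. */
  public static int boundaryOfLowestCost(List<Clause> clauses, int target) {
    Clause lead = clauses.get(0);
    for (Clause c : clauses) {
      if (c.cost < lead.cost) {
        lead = c;
      }
    }
    return lead.blockEnd(target);
  }

  /** Minimum block boundary across all clauses, or across essential ones only. */
  public static int minBoundary(List<Clause> clauses, int target, boolean essentialOnly) {
    int min = Integer.MAX_VALUE;
    for (Clause c : clauses) {
      if (essentialOnly && !c.essential) {
        continue; // skip non-essential clauses when asked to
      }
      min = Math.min(min, c.blockEnd(target));
    }
    return min;
  }

  public static void main(String[] args) {
    List<Clause> clauses = Arrays.asList(
        new Clause(new int[] {127, 255, 383}, 1000L, false),
        new Clause(new int[] {63, 300}, 50L, true),
        new Clause(new int[] {200, 400}, 500L, true));
    int target = 100;
    System.out.println("lowest-cost clause: " + boundaryOfLowestCost(clauses, target));
    System.out.println("min over all:       " + minBoundary(clauses, target, false));
    System.out.println("min over essential: " + minBoundary(clauses, target, true));
  }
}
```

In this toy data the non-essential clause has the nearest block boundary, so 
restricting the minimum to essential clauses yields a later boundary and 
therefore fewer block transitions. Whether that helps in practice would need 
benchmarking.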

> Specialize 2-clauses disjunctions
> ---------------------------------
>
>                 Key: LUCENE-10480
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10480
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>          Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> WANDScorer is nice, but it also has lots of overhead to maintain its 
> invariants: one linked list for the current candidates, one priority queue of 
> scorers that are behind, another one for scorers that are ahead. All this 
> could be simplified in the 2-clauses case, which feels worth specializing for 
> as it's very common that end users enter queries that only have two terms?


