[jira] [Comment Edited] (LUCENE-8727) IndexSearcher#search(Query,int) should operate on a shared priority queue when configured with an executor

Mayya Sharipova (JIRA) Fri, 19 Jul 2019 14:29:31 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889181#comment-16889181
 ]


Mayya Sharipova edited comment on LUCENE-8727 at 7/19/19 9:28 PM:
------------------------------------------------------------------

Some comments about design option #1.

I think we should share only the min competitive score between collectors (it 
could be an AtomicLong or something similar), and not the top hits. The reason 
for not sharing the top hits is that collectors expect leaves in 
[sequential order|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java#L240-L242].
If the executor happens to process leaves with higher doc IDs first, we may 
fill the global priority queue with docs with higher IDs and set the global min 
competitive score to the next float above the lowest score in the queue. Later, 
when we process leaves with smaller doc IDs, the global priority queue is 
already full and the updated global min competitive score applies, so we will 
have to skip all of these docs with smaller doc IDs, even if they have the same 
scores as the docs with higher doc IDs and should have been selected instead.
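
To make the scenario above concrete, here is a toy simulation. It is not 
Lucene code: the class SharedQueueTieDemo, the Hit type and the collect method 
are hypothetical names invented for this sketch, and the queue only imitates 
the "skip anything that does not beat the weakest hit once the queue is full" 
rule described above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Toy model of one top-N queue shared by all leaves; not Lucene code. */
class SharedQueueTieDemo {

    /** A hit: global doc ID plus score (hypothetical helper type). */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /**
     * Collects the top n hits with a single shared queue and a single shared
     * min competitive score, visiting the hits in the order given.
     * Returns the selected doc IDs in ascending order.
     */
    static List<Integer> collect(List<Hit> hitsInVisitOrder, int n) {
        // The head of the queue is the weakest hit:
        // lowest score first, and on equal scores the highest doc ID first.
        Comparator<Hit> weakestFirst = (a, b) -> a.score != b.score
                ? Float.compare(a.score, b.score)
                : Integer.compare(b.doc, a.doc);
        PriorityQueue<Hit> queue = new PriorityQueue<>(weakestFirst);
        float minCompetitive = 0f;

        for (Hit hit : hitsInVisitOrder) {
            if (hit.score < minCompetitive) {
                continue; // skipped as non-competitive
            }
            queue.add(hit);
            if (queue.size() > n) {
                queue.poll(); // evict the weakest hit
            }
            if (queue.size() == n) {
                // Full queue: only strictly better scores stay competitive.
                // This is only safe if later hits always have higher doc IDs.
                minCompetitive = Math.nextUp(queue.peek().score);
            }
        }

        List<Integer> docs = new ArrayList<>();
        for (Hit hit : queue) {
            docs.add(hit.doc);
        }
        Collections.sort(docs);
        return docs;
    }
}
```

With four hits that all score 1.0 (docs 0 and 1 in one leaf, docs 10 and 11 in 
another) and n = 2, visiting the leaves in doc ID order selects docs 0 and 1, 
while visiting the second leaf first selects docs 10 and 11, which is the wrong 
outcome described above.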

If every collector has its own priority queue, each one will first fill its 
queue to N and only then set the min competitive score.
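
A minimal sketch of that arrangement, under stated assumptions: every name here 
(SharedMinScoreSketch, SharedMinScore, SliceCollector, raiseTo) is hypothetical 
and none of it is an existing Lucene API. Scores are assumed to be non-negative 
floats, so their int bit patterns sort like the values and can be kept in an 
AtomicLong. The sketch publishes the local minimum itself rather than the next 
float, so that equal-scoring hits on other slices stay competitive; that is a 
choice made for the sketch, since the comment does not say which value to 
publish.

```java
import java.util.PriorityQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: per-slice queues that share only a min competitive score. */
class SharedMinScoreSketch {

    /** Shared min competitive score, stored as float bits in an AtomicLong. */
    static final class SharedMinScore {
        private final AtomicLong bits = new AtomicLong(Float.floatToIntBits(0f));

        /** Raises the shared value to score if score is larger (score >= 0). */
        void raiseTo(float score) {
            // For non-negative floats, larger bit patterns mean larger values.
            bits.accumulateAndGet(Float.floatToIntBits(score), Math::max);
        }

        float get() {
            return Float.intBitsToFloat((int) bits.get());
        }
    }

    /** One collector per slice, each with its own top-n queue of scores. */
    static final class SliceCollector {
        private final int n;
        private final SharedMinScore shared;
        // Min-heap: the head is the lowest score currently kept by this slice.
        private final PriorityQueue<Float> localTop = new PriorityQueue<>();

        SliceCollector(int n, SharedMinScore shared) {
            this.n = n;
            this.shared = shared;
        }

        /** Returns true if the hit was kept in this slice's queue. */
        boolean collect(float score) {
            if (score < shared.get()) {
                return false; // some slice already holds n hits at or above this
            }
            if (localTop.size() < n) {
                localTop.add(score);
            } else if (score > localTop.peek()) {
                localTop.poll();
                localTop.add(score);
            } else {
                return false; // does not beat this slice's weakest hit
            }
            if (localTop.size() == n) {
                // Publish only once the local queue has been filled to n.
                shared.raiseTo(localTop.peek());
            }
            return true;
        }
    }
}
```

Each slice would run its own SliceCollector on its own thread and pass the same 
SharedMinScore instance to all of them; merging the per-slice queues at the end 
is left out of the sketch.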



> IndexSearcher#search(Query,int) should operate on a shared priority queue 
> when configured with an executor
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8727
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8727
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>
> If IndexSearcher is configured with an executor, then the top docs for each 
> slice are computed separately before being merged once the top docs for all 
> slices are computed. With block-max WAND this is a bit of a waste of 
> resources: it would be better if an increase of the min competitive score 
> could help skip non-competitive hits on every slice and not just the current 
> one.



