[ 
https://issues.apache.org/jira/browse/LUCENE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888766#comment-16888766
 ] 

Atri Sharma commented on LUCENE-8727:
-------------------------------------

[~jpountz] Here are two thoughts on how this could be implemented:

 

1) Shared priority queue: a single priority queue, held in the parent 
CollectorManager, is used by all Collectors. This works out naturally: once 
the top N hits have been collected globally, the minimum competitive score can 
be raised without the Collectors coordinating with each other, and further 
hits are ranked against it. The downside is that the priority queue 
implementation will have to be synchronized, so there may be a performance 
hit, since the synchronization sits on the critical path of segment collection.

 

2) Alternatively, for N hits, each slice starts with an equal, prorated share 
of the hits (M Collectors, so N/M hits each). Each Collector gets a callback, 
supplied by the CollectorManager, which it calls with the number of hits 
collected so far and the score of its highest-scoring local hit. The callback 
returns the minimum competitive score seen globally so far, and the Collector 
uses that score to filter out the remaining hits. The point at which a 
Collector invokes the callback is flexible; the simplest option is after every 
N/M hits. The downside of this approach is the communication required between 
the Collectors and the CollectorManager, and that some redundant hits can be 
collected because the callback is only invoked periodically. In contrast, the 
shared priority queue mechanism allows for accurate filtering.
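
To make option 1 concrete, here is a rough sketch of a synchronized shared 
queue. It is only an illustration under simplifying assumptions: the class 
name SharedTopN and its methods are hypothetical (not Lucene APIs), and it 
tracks scores only, without doc IDs or tie-breaking.

```java
import java.util.PriorityQueue;

/** Hypothetical shared top-N score queue; illustrative only, not a Lucene class. */
final class SharedTopN {
    private final int n;
    // Min-heap: the head is the weakest of the current global top-N scores.
    private final PriorityQueue<Float> heap = new PriorityQueue<>();

    SharedTopN(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
    }

    /** Offers a score from any slice; returns true if it entered the global top N. */
    synchronized boolean offer(float score) {
        if (heap.size() < n) {
            heap.add(score);
            return true;
        }
        if (score > heap.peek()) {
            heap.poll();      // evict the weakest retained score
            heap.add(score);
            return true;
        }
        return false;
    }

    /** Minimum competitive score; negative infinity until N hits have been collected. */
    synchronized float minCompetitiveScore() {
        return heap.size() < n ? Float.NEGATIVE_INFINITY : heap.peek();
    }
}
```

Every Collector would call offer() for each hit, so the lock is taken on the 
hot path, which is the cost mentioned for this option.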

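Option 2 could look roughly like the sketch below. It simplifies the idea in 
two ways, both of which are my assumptions here: each Collector keeps a full 
local top-N heap rather than a prorated N/M one, and it reports the weakest 
score in that heap (a safe lower bound) instead of its highest-scoring hit. 
ScoreCallback, BoundTracker and SliceCollector are hypothetical names, not 
Lucene APIs.

```java
import java.util.PriorityQueue;

/** Hypothetical callback handed out by the manager; not a Lucene interface. */
interface ScoreCallback {
    /** Reports a local lower bound and returns the best bound known globally. */
    float exchange(float localBound);
}

/** Hypothetical manager side: remembers the highest bound any slice has reported. */
final class BoundTracker implements ScoreCallback {
    private float globalBound = Float.NEGATIVE_INFINITY;

    @Override
    public synchronized float exchange(float localBound) {
        globalBound = Math.max(globalBound, localBound);
        return globalBound;
    }
}

/** Hypothetical per-slice collector that syncs with the manager every `interval` hits. */
final class SliceCollector {
    private final int n;
    private final int interval;
    private final ScoreCallback callback;
    // Min-heap of the local top-N scores; only touched by this slice, so no locking.
    private final PriorityQueue<Float> heap = new PriorityQueue<>();
    private float minCompetitive = Float.NEGATIVE_INFINITY;
    private int seen;

    SliceCollector(int n, int interval, ScoreCallback callback) {
        if (n <= 0 || interval <= 0) {
            throw new IllegalArgumentException("n and interval must be positive");
        }
        this.n = n;
        this.interval = interval;
        this.callback = callback;
    }

    /** Returns true if the hit was kept locally. */
    boolean collect(float score) {
        seen++;
        boolean kept = false;
        // Hits below the last known global bound cannot reach the global top N.
        if (score >= minCompetitive) {
            if (heap.size() < n) {
                heap.add(score);
                kept = true;
            } else if (score > heap.peek()) {
                heap.poll();
                heap.add(score);
                kept = true;
            }
        }
        if (seen % interval == 0) {
            // A full local heap proves that N hits score at least heap.peek().
            float local = heap.size() < n ? Float.NEGATIVE_INFINITY : heap.peek();
            minCompetitive = callback.exchange(local);
        }
        return kept;
    }
}
```

With `interval` set to N/M this matches the simplest schedule described above. 
Hits collected between two exchanges are filtered against a stale bound, which 
is where the redundant hits come from.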
 

WDYT?

> IndexSearcher#search(Query,int) should operate on a shared priority queue 
> when configured with an executor
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8727
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8727
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>
> If IndexSearcher is configured with an executor, then the top docs for each 
> slice are computed separately before being merged once the top docs for all 
> slices are computed. With block-max WAND this is a bit of a waste of 
> resources: it would be better if an increase of the min competitive score 
> could help skip non-competitive hits on every slice and not just the current 
> one.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

