[ https://issues.apache.org/jira/browse/LUCENE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888766#comment-16888766 ]
Atri Sharma commented on LUCENE-8727:
-------------------------------------
[~jpountz] Here are two thoughts on how this could be implemented:
1) Shared priority queue: a single priority queue, held in the parent
CollectorManager, is used by all Collectors. This fits naturally: once the
top N hits have been collected globally, the minimum competitive score can be
raised without the Collectors coordinating with each other, and further hits
are ranked against it. The downside is that the priority queue implementation
has to be synchronized, so there can be a performance hit because the critical
path of segment collection is affected.
2) Alternatively, for N hits, each slice starts with an equal prorated share
of hits (M Collectors, so N/M hits each). Each Collector gets a callback,
which it calls with the number of hits collected so far and the score of its
highest-scoring local hit. The callback returns the minimum competitive score
seen globally so far, and the Collector uses that score to filter out the
remaining hits. The point at which a Collector invokes the callback is
flexible, the simplest choice being after every N/M hits. The callback is
provided by the CollectorManager. The downside of this approach is the
communication involved between the Collectors and the CollectorManager, and
some redundant hits can be collected because the callback is only invoked
periodically. In contrast, the shared priority queue allows for accurate
filtering.
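To make option 1 a bit more concrete, here is a rough sketch of what the shared structure could look like. It is only an illustration, not a patch: the class name SharedTopScores and the method offer are hypothetical, the queue tracks scores only, and tie-breaking by doc ID is ignored. The single synchronized method is the contention point mentioned above.

```java
import java.util.PriorityQueue;

/**
 * Hypothetical sketch: a bounded top-N score queue that a CollectorManager
 * could create once and hand to every per-slice Collector.
 */
final class SharedTopScores {
    private final int topN;
    // Min-heap: the head is the lowest score currently in the global top N.
    private final PriorityQueue<Float> heap = new PriorityQueue<>();

    SharedTopScores(int topN) {
        if (topN <= 0) {
            throw new IllegalArgumentException("topN must be positive");
        }
        this.topN = topN;
    }

    /**
     * Offers the score of a hit from any slice and returns the current global
     * minimum competitive score: the N-th best score seen so far, or
     * Float.NEGATIVE_INFINITY while fewer than N hits have been collected.
     * Synchronized because every Collector thread calls it.
     */
    synchronized float offer(float score) {
        if (heap.size() < topN) {
            heap.add(score);
        } else if (score > heap.peek()) {
            // The new hit displaces the weakest hit in the global top N.
            heap.poll();
            heap.add(score);
        }
        return heap.size() < topN ? Float.NEGATIVE_INFINITY : heap.peek();
    }
}
```

Under these assumptions, each Collector would call offer for every hit it collects and could pass the returned value on to its scorer (for example via Scorable#setMinCompetitiveScore), so that a raise triggered by one slice is seen by all slices.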
WDYT?
> IndexSearcher#search(Query,int) should operate on a shared priority queue
> when configured with an executor
> ----------------------------------------------------------------------------------------------------------
>
> Key: LUCENE-8727
> URL: https://issues.apache.org/jira/browse/LUCENE-8727
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
>
> If IndexSearcher is configured with an executor, then the top docs for each
> slice are computed separately before being merged once the top docs for all
> slices are computed. With block-max WAND this is a bit of a waste of
> resources: it would be better if an increase of the min competitive score
> could help skip non-competitive hits on every slice and not just the current
> one.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)