[ 
https://issues.apache.org/jira/browse/LUCENE-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890760#comment-16890760
 ] 

Atri Sharma commented on LUCENE-8929:
-------------------------------------

bq. So you need to collect each segment at least until {numHits} hits have been 
collected, or until the last collected hit was not competitive globally 
(whichever comes first)

Yeah, sorry I was not clear. Per collector, we will collect until numHits hits 
are collected.

I have opened a PR implementing the same: 
https://github.com/apache/lucene-solr/pull/803

Hoping the code gives more clarity

> Early Terminating CollectorManager
> ----------------------------------
>
>                 Key: LUCENE-8929
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8929
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Atri Sharma
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should have an early terminating collector manager which accurately tracks 
> hits across all of its collectors and determines when there are enough hits, 
> allowing all the collectors to abort.
> The options for the same are:
> 1) Shared total count : Global "scoreboard" where all collectors update their 
> current hit count. At the end of each document's collection, collector checks 
> if N > threshold, and aborts if true
> 2) State Reporting Collectors: Collectors report their total number of counts 
> collected periodically using a callback mechanism, and get a proceed or abort 
> decision.
> 1) has the overhead of synchronization in the hot path, 2) can collect 
> unnecessary hits before aborting.
> I am planning to work on 2), unless objections



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to