atris commented on issue #823: LUCENE-8939: Introduce Shared Count Early 
Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#issuecomment-527285002
 
 
   > > https://issues.apache.org/jira/browse/LUCENE-8681
   > 
   > I am not sure I see how that solves the problem?
   > 
   > The core change targeted in this PR is allowing `TOTAL_HITS_THRESHOLD` to 
be accurately counted across all slices. Today, we collect 
`TOTAL_HITS_THRESHOLD` per slice, which is not what the API definition is. Post 
this PR, we will collect `TOTAL_HITS_THRESHOLD` in aggregate. 
`TOTAL_HITS_THRESHOLD`' s definition does not guarantee any order of collection 
of hits in the concurrent case -- we inadvertently define one today by 
collection the threshold number of hits per slice.
   > 
   > RE: Proration, I believe that is a custom logic that can be added on top 
of this change. In any case, the proration logic also works on a bunch of 
static values + fudge factors, so it can go wrong and we might end up 
collecting lesser hits from a more valuable segment. To help prevent this 
scenario, I believe proration might also do well to build upon this PR and use 
the shared counter. But, I am unable to see why proration and accurate counting 
across slices are mutually exclusive.
   > 
   > In any case, unlike proration, this PR does not propose any algorithmic 
changes to the way collection is done -- it simply reduces extra work done 
across slices that we do not even advertise today, so might be something that 
the user is unaware of.
   
   To summarize my monologue, this PR is aimed at accurate counting of hits 
across all slices -- whereas proration targets a different use case of  trying 
to "distributing" hits across slices based on some parameters.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to