dsmiley commented on a change in pull request #1501:
URL: https://github.com/apache/lucene-solr/pull/1501#discussion_r422539014

##########
File path: solr/solr-ref-guide/src/common-query-parameters.adoc
##########
@@ -361,3 +361,42 @@ This is what happens if a similar request is sent that adds `echoParams=all` to
 }
 }
 ----
+
+== minExactHits Parameter

Review comment:
Wouldn't "minNumFoundExact" be a better parameter name because it aligns with the "numFound" that we expose? I searched for "hits" in Solr string literals and the ref guide. It's not used much; more often refers to cache hits.

##########
File path: solr/solr-ref-guide/src/common-query-parameters.adoc
##########
@@ -361,3 +361,42 @@ This is what happens if a similar request is sent that adds `echoParams=all` to
 }
 }
 ----
+
+== minExactHits Parameter
+When this parameter is used, Solr will count the number of hits accurately at least until this value. After that, Solr can skip over documents that don't have a score high enough to enter in the top N. This can greatly improve performance o search queries. On the other hand, when this parameter is used, the `numFound` may not be exact, and may instead be an approximation.
+The `numFoundExact` boolean attribute is included in all responses, indicating if the `numFound` value is exact or an approximation. If it's an approximation, the real number of hits for the query is guaranteed to be greater or equal `numFound`.
+
+More about approximate document counting and `minExactHits`:
+* The documents returned in the response are guaranteed to be the docs with the top scores. This parameter will not skip documents that are to be returned in the response, it will only skip counting docs that, while they match the query, their score is low enough to not be in the top N.
+* Providing `minExactHits` doesn't guarantee that Solr will use approximate hit counting (and thus, provide the speedup). Some types of queries, or other parameters (like if facets are requested) will require accurate counting. The value of `numFoundExact` indicates if the approximation was used or not.
+* Approximate counting can only be used when sorting by `score desc` first (which is the default sort in Solr). Other fields can be used after `score desc`, but if any other type of sorting is used before score, then the approximation won't be applied.
+* When doing distributed queries across multiple shards, each shard will accurately count hits until `minExactHits` (which means the query could be hitting `numShards * minExactHits` docs and `numFound` in the response would still be accurate)
+For example:
+
+[source,text]
+q=quick brown fox&minExactHits=100&rows=10
+
+[source,json]
+----
+"response": {
+  "numFound": 153,
+  "start": 0,
+  "hitCountExact": false,

Review comment:
didn't we agree on "numFoundExact" ?
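
For anyone following the thread, here is a rough sketch of how a client might consume the fields discussed in this patch. It is only an illustration under assumptions: it assumes the flag ends up being named `numFoundExact` (the name used in the patch prose), and `describe_hit_count` and `max_exactly_counted_hits` are hypothetical helper names, not part of Solr or SolrJ.

```python
def describe_hit_count(response: dict) -> str:
    """Format the hit count from a Solr-style "response" object.

    Assumes the keys "numFound" and "numFoundExact" as described in the
    patch prose; the final field name is still under review.
    """
    num_found = response["numFound"]
    # Treating a missing flag as "exact" is an assumption of this sketch.
    exact = response.get("numFoundExact", True)
    if exact:
        return f"{num_found} results"
    # Per the patch text, the real count is >= numFound when approximate.
    return f"at least {num_found} results"


def max_exactly_counted_hits(num_shards: int, min_exact_hits: int) -> int:
    """Upper end of the exactly counted range for a distributed query.

    Per the patch text, each shard counts accurately up to minExactHits,
    so up to numShards * minExactHits hits may be counted exactly overall.
    """
    return num_shards * min_exact_hits


if __name__ == "__main__":
    sample = {"numFound": 153, "start": 0, "numFoundExact": False}
    print(describe_hit_count(sample))
    print(max_exactly_counted_hits(num_shards=4, min_exact_hits=100))
```

The point of the first helper is that a UI should present an approximate `numFound` as a lower bound rather than as a precise total.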