Tomas Eduardo Fernandez Lobbe created SOLR-15390:
----------------------------------------------------
Summary: Deprecate "segmentTerminateEarly" in favor of
"minExactCount"
Key: SOLR-15390
URL: https://issues.apache.org/jira/browse/SOLR-15390
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Tomas Eduardo Fernandez Lobbe
When
[segmentTerminateEarly|https://solr.apache.org/guide/8_8/common-query-parameters.html#segmentterminateearly-parameter]
is set to true in Solr queries and the index is sorted by the same field used
for sorting the query, Solr can skip over documents once it has collected
enough, saving some time at the cost of not having the exact document hit count.
{{TopFieldCollector}} can also achieve this now, and can do it more
efficiently, since it can keep track of the hits across segments too.
{{EarlyTerminatingSortingCollector}} (the collector used when
"segmentTerminateEarly" is set to true) has been deprecated in Lucene and
will eventually be removed, so we'll have the option of either copying it to
Solr or getting rid of the functionality. I think we should deprecate the
functionality in favor of "minExactCount".
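For illustration, here is a minimal client-side sketch of how the two request styles differ. The base URL, the collection name {{techproducts}}, the sort field {{timestamp}}, and the helper {{build_select_url}} are assumptions made up for this example; only the parameter names "segmentTerminateEarly" and "minExactCount" come from Solr.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, params):
    """Return a /select URL for the given query parameters (hypothetical helper)."""
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Assumption: the index is sorted by `timestamp desc`, matching the query sort.
common = {"q": "*:*", "sort": "timestamp desc", "rows": 10}

# Current approach: ask for per-segment early termination.
old_style = dict(common, segmentTerminateEarly="true")

# Proposed replacement: the hit count only needs to be exact up to this value.
new_style = dict(common, minExactCount=10)

old_url = build_select_url("http://localhost:8983/solr", "techproducts", old_style)
new_url = build_select_url("http://localhost:8983/solr", "techproducts", new_style)
```

In both cases the reported hit count may be approximate, so a client that needs the exact total should not set either parameter.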
If we want to keep the parameter, one thing we could do is, when
"segmentTerminateEarly" is set to true, internally set "minExactCount" to
something low (0 should work) and get rid of the code using
{{EarlyTerminatingSortingCollector}}. I think we should just remove the
parameter for a simpler API, although we may want to do this mapping for back
compatibility.
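The back-compatibility idea above could be sketched roughly as follows. This is a Python illustration of the mapping logic only, not Solr's actual Java code. The function name {{translate_legacy_params}} is hypothetical, as is the choice to let an explicit "minExactCount" take precedence over the legacy flag.

```python
def translate_legacy_params(params):
    """Map a legacy segmentTerminateEarly request onto minExactCount.

    Hypothetical sketch: returns a new dict and leaves the input untouched.
    """
    translated = dict(params)
    legacy = str(translated.pop("segmentTerminateEarly", "false")).lower()
    # Assumption of this sketch: an explicit minExactCount from the caller
    # wins over the legacy flag.
    if legacy == "true" and "minExactCount" not in translated:
        translated["minExactCount"] = 0  # "something low"; 0 should work
    return translated
```

With this mapping, a request carrying only segmentTerminateEarly=true ends up with minExactCount=0, and the code path that uses {{EarlyTerminatingSortingCollector}} is no longer needed.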
I've run some perf tests to validate this, and in all cases minExactCount
performed as well as or slightly better than segmentTerminateEarly on sorted
indices.