Hoss Man created SOLR-11729:
-------------------------------
Summary: Increase default overrequest ratio/count in json.facet to
match existing defaults for facet.overrequest.ratio & facet.overrequest.count ?
Key: SOLR-11729
URL: https://issues.apache.org/jira/browse/SOLR-11729
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hoss Man
When FacetComponent first got support for distributed search, the default
"effective shard limit" requested from each shard followed the formula...
{code}
limit = (int)(dff.initialLimit * 1.5) + 10;
{code}
...over time, this became configurable with the introduction of some expert
level tuning options: {{facet.overrequest.ratio}} & {{facet.overrequest.count}}
-- but the defaults (and basic formula) remain the same to this day...
{code}
this.overrequestRatio
= params.getFieldDouble(field, FacetParams.FACET_OVERREQUEST_RATIO,
1.5);
this.overrequestCount
= params.getFieldInt(field, FacetParams.FACET_OVERREQUEST_COUNT, 10);
...
private int doOverRequestMath(int limit, double ratio, int count) {
// NOTE: normally, "1.0F < ratio"
//
// if the user chooses a ratio < 1, we allow it and don't "bottom out" at
// the original limit until *after* we've also added the count.
int adjustedLimit = (int) (limit * ratio) + count;
return Math.max(limit, adjustedLimit);
}
{code}
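As a quick illustration of what that formula yields, here is a minimal sketch that copies the {{doOverRequestMath}} body quoted above into a hypothetical demo class ({{OverRequestMathDemo}} is not a Solr class) and evaluates it for a few sample limits, including a ratio below 1:

```java
// Illustrative only: OverRequestMathDemo is a hypothetical class, not part of Solr.
// The method body is copied from the FacetComponent snippet quoted in this issue.
class OverRequestMathDemo {
    static int doOverRequestMath(int limit, double ratio, int count) {
        int adjustedLimit = (int) (limit * ratio) + count;
        // Never ask a shard for fewer buckets than the original limit.
        return Math.max(limit, adjustedLimit);
    }

    public static void main(String[] args) {
        // Defaults: ratio = 1.5, count = 10
        System.out.println(doOverRequestMath(10, 1.5, 10));   // 25
        System.out.println(doOverRequestMath(100, 1.5, 10));  // 160
        // Ratio < 1: the count is added first, then the result is floored at the limit.
        System.out.println(doOverRequestMath(10, 0.5, 2));    // 10
        System.out.println(doOverRequestMath(10, 0.5, 10));   // 15
    }
}
```

With the defaults, a request for the top 10 terms asks each shard for 25.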
However...
When {{json.facet}} multi-shard refinement was added, the code was written
slightly differently:
* there is an explicit {{overrequest:N}} (count) option
* if {{-1 == overrequest}} (which is the default) then an "effective shard
limit" is computed using the same basic formula as in FacetComponent -- _*but
the constants are different*_...
** {{effectiveLimit = (long) (effectiveLimit * 1.1 + 4);}}
* For any (non "-1") user specified {{overrequest}} value, it's added verbatim
to the {{limit}} (which may have been user specified, or may just be the
default)
** {{effectiveLimit += freq.overrequest;}}
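The two branches described in the list above can be sketched roughly as follows. This is a simplified illustration, not the actual Solr code: the class and method names are hypothetical, and it assumes the effective limit starts out equal to {{limit}} and ignores any other conditions the real code applies.

```java
// Illustrative only: a hypothetical stand-alone version of the json.facet
// over-request branches described in this issue. Assumes the starting
// effective limit is simply the requested limit.
class JsonFacetOverRequestSketch {
    static long effectiveShardLimit(long limit, int overrequest) {
        long effectiveLimit = limit;
        if (overrequest == -1) {
            // Default case: implicit ratio and count (1.1 and 4).
            effectiveLimit = (long) (effectiveLimit * 1.1 + 4);
        } else {
            // User-specified overrequest:N is added verbatim, no ratio applied.
            effectiveLimit += overrequest;
        }
        return effectiveLimit;
    }

    public static void main(String[] args) {
        System.out.println(effectiveShardLimit(10, -1));  // 15
        System.out.println(effectiveShardLimit(100, -1)); // 114
        System.out.println(effectiveShardLimit(10, 20));  // 30
        System.out.println(effectiveShardLimit(10, 0));   // 10
    }
}
```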
Given the design of the {{json.facet}} syntax, I can understand why the code
path for an "advanced" user specified {{overrequest:N}} option avoids using any
(implicit) ratio calculation and just does the straight addition of {{limit +=
overrequest}}.
What I'm not clear on is the choice of the constants {{1.1}} and {{4}} in the
common (default) case, and why those differ from the historically used {{1.5}}
and {{10}}.
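To get a feel for how far apart the two defaults are, the following sketch prints the per-shard limit each formula produces for a few sample {{limit}} values. It uses the FacetComponent defaults from the code quoted earlier (ratio {{1.5}}, count {{10}}) and the {{json.facet}} defaults ({{1.1}}, {{4}}); the class and method names are hypothetical.

```java
// Illustrative only: hypothetical helper comparing the two default formulas
// quoted in this issue. Neither method exists under these names in Solr.
class DefaultOverRequestComparison {
    // FacetComponent default: ratio 1.5, count 10, floored at the limit.
    static int facetComponentDefault(int limit) {
        return Math.max(limit, (int) (limit * 1.5) + 10);
    }

    // json.facet default when overrequest == -1: ratio 1.1, count 4.
    static long jsonFacetDefault(long limit) {
        return (long) (limit * 1.1 + 4);
    }

    public static void main(String[] args) {
        int[] limits = {5, 10, 50, 100};
        for (int limit : limits) {
            System.out.println("limit=" + limit
                + " facet.field=" + facetComponentDefault(limit)
                + " json.facet=" + jsonFacetDefault(limit));
        }
    }
}
```

For {{limit=10}} this works out to 25 buckets per shard for FacetComponent versus 15 for {{json.facet}}, and for {{limit=100}} to 160 versus 114.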
----
It may seem like a small thing to worry about, but it can/will cause odd
inconsistencies when people try to migrate simple {{facet.field=foo}} (or
{{facet.pivot=foo,bar}}) queries to {{json.facet}} -- I have also seen it give
people attempting these types of migrations the (mistaken) impression that
discrepancies they are seeing are because {{refine:true}} is not working.
For this reason, I propose we change the (default) {{overrequest:-1}} behavior
to use the same constants as the equivalent FacetComponent code...
{code}
if (fcontext.isShard()) {
if (freq.overrequest == -1) {
// add over-request if this is a shard request and if we have a small
// offset (large offsets will already be gathering many more buckets than needed)
if (freq.offset < 10) {
effectiveLimit = (long) (effectiveLimit * 1.5 + 10);
}
...
{code}