[
https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296621#comment-15296621
]
Dennis Gove commented on SOLR-8988:
-----------------------------------
Just to slightly rephrase the salient point here:
Suppose you asked for up to 10 terms from shardA with mincount=1 but
received only 5 terms back. In that case you know, definitively, that a term
seen in the response from shardB but not in the response from shardA has a
count of 0 in shardA. If it had any higher count in shardA, it would have
been returned in the response from shardA.
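
To make that bound concrete, here is a minimal sketch. It is not Solr's
refinement code; the function name and the dict-shaped shard response are
made up for illustration, and it assumes facet.sort=count and limit >= 1.

```python
def max_unseen_count(shard_counts, limit, mincount=1):
    """Upper bound on the count that a term absent from one shard's
    response can have on that shard.

    shard_counts: hypothetical {term: count} mapping returned by the shard.
    limit:        number of terms requested from the shard (assumed >= 1).
    mincount:     the mincount sent to the shard.
    """
    if len(shard_counts) < limit:
        # The shard ran out of terms meeting mincount, so an absent term
        # must have a count below mincount (never below 0).
        return max(mincount - 1, 0)
    # The response is full: an absent term can be at most as frequent as
    # the least frequent term that was returned.
    return min(shard_counts.values())


# Asked shardA for 10 terms with mincount=1, got 5 back.
shard_a = {"t1": 9, "t2": 4, "t3": 2, "t4": 1, "t5": 1}
print(max_unseen_count(shard_a, limit=10, mincount=1))
```

For the short response above the bound is 0, so a term seen only on shardB
needs no follow-up request to shardA.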
Also, if you asked for up to 10 terms from shardA with mincount=1 and got
back a response with 10 terms, each having a count >= 1, then the response is
identical to the one you'd have received with mincount=0.
Because of this, there is no scenario where a mincount=1 response results in
more work than a mincount=0 response would have required. The reduction in
required work with mincount=1 is therefore *always* either a moot point or a
net win.
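
Both cases can be checked against a toy model of a single shard. This is
only a sketch under assumptions: shard_top_terms is a hypothetical stand-in
for a shard's count-sorted facet response, with ties broken by term, and is
not how Solr computes facets.

```python
def shard_top_terms(counts, limit, mincount):
    """Toy shard: return up to `limit` (term, count) pairs with
    count >= mincount, sorted by descending count, then by term."""
    eligible = [(t, c) for t, c in counts.items() if c >= mincount]
    eligible.sort(key=lambda tc: (-tc[1], tc[0]))
    return eligible[:limit]


# Per-shard counts for the current query; "c" and "d" match no documents.
counts = {"a": 4, "b": 2, "c": 0, "d": 0}

# Full response: mincount=1 and mincount=0 give the same answer.
full_1 = shard_top_terms(counts, limit=2, mincount=1)
full_0 = shard_top_terms(counts, limit=2, mincount=0)
print(full_1 == full_0)

# Short response: mincount=1 returns a prefix of the mincount=0 answer,
# and everything it leaves out has a count of 0.
short_1 = shard_top_terms(counts, limit=3, mincount=1)
short_0 = shard_top_terms(counts, limit=3, mincount=0)
print(short_1, short_0[len(short_1):])
```

In the short case the only terms missing from the mincount=1 response are
zero-count terms, which is exactly what the coordinator can infer on its own.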
> Improve facet.method=fcs performance in SolrCloud
> -------------------------------------------------
>
> Key: SOLR-8988
> URL: https://issues.apache.org/jira/browse/SOLR-8988
> Project: Solr
> Issue Type: Improvement
> Reporter: Keith Laban
> Attachments: SOLR-8988.patch, SOLR-8988.patch, SOLR-8988.patch,
> SOLR-8988.patch, Screen Shot 2016-04-25 at 2.54.47 PM.png, Screen Shot
> 2016-04-25 at 2.55.00 PM.png
>
>
> This relates to SOLR-8559, which improves the algorithm used by fcs
> faceting when {{facet.mincount=1}}.
> This patch allows {{facet.mincount}} to be sent as 1 for distributed queries.
> As far as I can tell there is no reason to set {{facet.mincount=0}} for
> refinement purposes. After trying to make sense of all the refinement logic,
> I can't see how the difference between _no value_ and _value=0_ would have a
> negative effect.
> *Test perf:*
> - ~15 million unique terms
> - query matches ~3 million documents
> *Params:*
> {code}
> facet.mincount=1
> facet.limit=500
> facet.method=fcs
> facet.sort=count
> {code}
> *Average Time Per Request:*
> - Before patch: ~20 seconds
> - After patch: <1 second
> *Note*: all tests pass, and in my test the output was identical before and
> after the patch.