[ 
https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296621#comment-15296621
 ] 

Dennis Gove commented on SOLR-8988:
-----------------------------------

Just to slightly rephrase the salient point here:

Suppose you asked for up to 10 terms from shardA with mincount=1 but 
received only 5 terms back. In this case you know, definitively, that a term 
seen in the response from shardB but not in the response from shardA must have 
a count of 0 in shardA. If it had any other count in shardA, it would have 
been returned in the response from shardA.

Also, if you asked for up to 10 terms from shardA with mincount=1 and you get 
back a response with 10 terms having a count >= 1, then the response is 
identical to the one you'd have received with mincount=0.

Because of this, there is no scenario where a mincount=1 response would result 
in more work than would have been required with mincount=0. The decrease in 
required work when mincount=1 is therefore *always* either a moot point or a 
net win.
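
The two cases above can be sketched in a few lines of Python. This is only an 
illustration of the argument, not Solr's refinement code: the helper names 
shard_response and known_count are hypothetical, and the shard is modeled as a 
plain term-to-count dict sorted by count and then by term.

```python
from typing import Dict, Optional


def shard_response(counts: Dict[str, int], limit: int, mincount: int) -> Dict[str, int]:
    """Hypothetical model of one shard's facet response: terms sorted by
    descending count (ties broken by term), filtered by mincount, cut at limit."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    kept = [(term, count) for term, count in ranked if count >= mincount]
    return dict(kept[:limit])


def known_count(response: Dict[str, int], term: str, limit: int) -> Optional[int]:
    """Return the shard's count for `term` if it can be inferred from a
    mincount=1 response alone, or None if the term may have been cut off."""
    if term in response:
        return response[term]
    if len(response) < limit:
        # The shard ran out of terms with count >= 1 before reaching the
        # limit, so a term it did not return must have count 0 there.
        return 0
    # The shard filled its limit; the missing term's count is unknown.
    return None


# Made-up counts for shardA; "d" and "e" do not occur in matching documents.
shard_a = {"a": 5, "b": 3, "c": 1, "d": 0, "e": 0}

# Case 1: fewer terms than the limit came back, so "d" is known to be 0.
short = shard_response(shard_a, limit=10, mincount=1)
inferred = known_count(short, "d", limit=10)

# Case 2: the limit is filled by terms with count >= 1, so mincount=1 and
# mincount=0 produce the same response.
full_min1 = shard_response(shard_a, limit=3, mincount=1)
full_min0 = shard_response(shard_a, limit=3, mincount=0)
```

In case 1 the inferred count matches what a mincount=0 request would have 
reported for "d", and in case 2 the two responses are equal, so neither case 
leaves the coordinator with less information than mincount=0 would.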

> Improve facet.method=fcs performance in SolrCloud
> -------------------------------------------------
>
>                 Key: SOLR-8988
>                 URL: https://issues.apache.org/jira/browse/SOLR-8988
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Keith Laban
>         Attachments: SOLR-8988.patch, SOLR-8988.patch, SOLR-8988.patch, 
> SOLR-8988.patch, Screen Shot 2016-04-25 at 2.54.47 PM.png, Screen Shot 
> 2016-04-25 at 2.55.00 PM.png
>
>
> This relates to SOLR-8559, which improves the algorithm used by fcs 
> faceting when {{facet.mincount=1}}.
> This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. 
> As far as I can tell, there is no reason to set {{facet.mincount=0}} for 
> refinement purposes. After trying to make sense of all the refinement logic, 
> I can't see how the difference between _no value_ and _value=0_ would have a 
> negative effect.
> *Test perf:*
> - ~15 million unique terms
> - query matches ~3 million documents
> *Params:*
> {code}
> facet.mincount=1
> facet.limit=500
> facet.method=fcs
> facet.sort=count
> {code}
> *Average Time Per Request:*
> - Before patch: ~20 seconds
> - After patch: <1 second
> *Note*: all tests pass, and in my test the output was identical before and 
> after the patch.


