[ 
https://issues.apache.org/jira/browse/SOLR-11159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103386#comment-16103386
 ] 

Yonik Seeley commented on SOLR-11159:
-------------------------------------

I don't see any incorrect bucket counts, just a missing value of "E"?

Refinement works like the following:
phase 1) collect the top N buckets from each shard and find the global "top N" 
buckets
phase 2) correct the counts of this global "top N" by requesting counts from 
shards that didn't provide a value for each bucket

So while we guarantee correct counts for the buckets that are returned, we 
don't guarantee that a value won't be missed altogether.
To increase the chances that we get the true global top N, we normally 
overrequest on phase 1.  But in your example, you explicitly disabled 
overrequest.

To fix, simply remove the "overrequest:0" part of your request.
For other requests, you can increase the overrequest value to reduce or 
eliminate the chance of missing buckets.
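
To make the two phases concrete, here is a small Python sketch that simulates 
them on the per-shard counts from the issue description. It is a simplified 
model, not the Facet Module's actual code: the function names (top_n, 
refined_facet) are made up for illustration, and breaking count ties by 
bucket value is an assumption chosen because it reproduces the output in 
the report.

```python
from collections import Counter

# Per-shard bucket counts copied from the issue description.
SHARDS = {
    "8987": {"C": 1},
    "8983": {"C": 4, "D": 1, "E": 1, "A": 1},
    "8985": {"E": 2, "A": 1, "D": 1},
}


def top_n(counts, n):
    # Sort by count descending; ties broken by bucket value (an assumption).
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def refined_facet(shards, limit, overrequest):
    # Phase 1: each shard returns only its own top (limit + overrequest)
    # buckets, and the merged partial counts pick the global candidates.
    partial = {s: dict(top_n(c, limit + overrequest)) for s, c in shards.items()}
    merged = Counter()
    for buckets in partial.values():
        merged.update(buckets)
    candidates = [val for val, _ in top_n(merged, limit)]

    # Phase 2: for each candidate, ask the shards that did not report it
    # in phase 1 for their count, so the returned counts are exact.
    result = {}
    for val in candidates:
        total = 0
        for shard, counts in shards.items():
            if val in partial[shard]:
                total += partial[shard][val]
            else:
                total += counts.get(val, 0)
        result[val] = total
    return top_n(result, limit)


print(refined_facet(SHARDS, limit=2, overrequest=0))  # [('C', 5), ('A', 2)]
print(refined_facet(SHARDS, limit=2, overrequest=2))  # [('C', 5), ('E', 3)]
```

With overrequest 0, "E" reaches phase 1 from only one shard (count 2), ties 
with "A", and loses the tie, so it is never refined. With a larger 
overrequest, every shard reports "E" and it takes its place in the top 2.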

> Facet buckets count still incorrect after passing {refine:true} | SOLR-7542
> ---------------------------------------------------------------------------
>
>                 Key: SOLR-11159
>                 URL: https://issues.apache.org/jira/browse/SOLR-11159
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Facet Module
>            Reporter: Amrit Sarkar
>         Attachments: COUNT_DESC_LIMIT_2, COUNT_DESC_LIMIT_3, DOCS
>
>
> I was experimenting / analysing the new *Refinement* feature in JSON Facet 
> Apis introduced in SOLR-7452. Passing {{refine:true}} with the facet 
> definition.
> I am listing down the test-scenarios along with test-data:
> 3 sharded collection on 3 nodes
> node/shard:          bucketVal - count
> 8987:       C - 1
> 8983:       C - 4       D - 1       E - 1       A - 1
> 8985:       E - 2       A - 1       D - 1
> Total: BUCKETS
> C - 5       E - 3       D - 2       A - 2
> It is giving accurate results for COUNT ASC, LIMIT 1 - 4
> {code}
> curl http://localhost:8983/solr/collection1/select -d 
> 'q=*:*&json.facet={cat_s:{type:terms,field:cat_s,sort:"count 
> asc",limit:1,overrequest:0,refine:true}}&wt=json&indent=true'
> {code}
> {code}
>   "facets":{
>     "count":12,
>     "cat_s":{
>       "buckets":[{
>           "val":"A",
>           "count":2}]}}}
> {code}
> {code}
> curl http://localhost:8983/solr/collection1/select -d 
> 'q=*:*&json.facet={cat_s:{type:terms,field:cat_s,sort:"count 
> asc",limit:2,overrequest:0,refine:true}}&wt=json&indent=true'
> {code}
> {code}
>   "facets":{
>     "count":12,
>     "cat_s":{
>       "buckets":[{
>           "val":"A",
>           "count":2},
>         {
>           "val":"D",
>           "count":2}]}}}
> {code}
> *BUT, COUNT DESC, LIMIT 2 and 3*
> {code}
> curl http://localhost:8983/solr/collection1/select -d 
> 'q=*:*&json.facet={cat_s:{type:terms,field:cat_s,sort:"count 
> desc",limit:2,overrequest:0,refine:true}}&wt=json&indent=true'
> {code}
> {code}
>   "facets":{
>     "count":12,
>     "cat_s":{
>       "buckets":[{
>           "val":"C",
>           "count":5},
>         {
>           "val":"A",
>           "count":2}]}}}
> {code}
> {code}
> curl http://localhost:8983/solr/collection1/select -d 
> 'q=*:*&json.facet={cat_s:{type:terms,field:cat_s,sort:"count 
> desc",limit:3,overrequest:0,refine:true}}&wt=json&indent=true'
> {code}
> {code}
>   "facets":{
>     "count":12,
>     "cat_s":{
>       "buckets":[{
>           "val":"C",
>           "count":5},
>         {
>           "val":"A",
>           "count":2},
>         {
>           "val":"D",
>           "count":2}]}}}
> {code}
> *bucketVal {{E}} and its count {{3}} is not in facet response* Pardon me if I 
> am missing some configuration or this behavior is right / justified. Ideally 
> we should see bucketVal E and its count 3.
> I am attaching Index DOCS, debugQuery for COUNT DESC, LIMIT 2 and LIMIT 3.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
