[ 
https://issues.apache.org/jira/browse/SOLR-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365690#comment-16365690
 ] 

John Montgomery commented on SOLR-7939:
---------------------------------------

I've looked into this a bit, and it appears to be caused by 
{{TopGroupsShardResponseProcessor}} using the wrong value when merging results: 
it appears to use the value from {{rows}} instead of the one from 
{{group.limit}}.

 

In particular this line:

[https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/grouping/distributed/responseprocessor/TopGroupsShardResponseProcessor.java#L176]

 

Which is:
{code:java}
int topN = rb.getGroupingSpec().getOffset() + 
rb.getGroupingSpec().getLimit();{code}
Roughly speaking it should be something like:
{code:java}
int topN = rb.getGroupingSpec().getWithinGroupOffset() + 
rb.getGroupingSpec().getWithinGroupLimit();{code}
From local testing, that fix does work (correct group count returned, etc.). 
However, the tests in {{TestDistributedGrouping}} then fail. This appears to 
be due to them passing {{group.limit=-1}}. I tried to update the code to 
allow for that, but the tests still fail with a mismatch between the number of 
results from the distributed vs non-distributed search (6 vs 7):
{code:java}
junit.framework.AssertionFailedError: .grouped[a_t:kings OR 
a_t:eggs].doclist.size():6!=7{code}
So it looks like the fix may need to be a bit more complex.
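
To illustrate the {{group.limit=-1}} case mentioned above, here is a minimal, standalone sketch of how a per-group {{topN}} might be computed. It assumes a negative within-group limit is meant as "no limit". The class and method names ({{WithinGroupTopN}}, {{withinGroupTopN}}) are hypothetical and not part of Solr. This is not the actual fix, and it does not explain the 6 vs 7 mismatch reported by the test.

```java
/** Hypothetical helper for illustration only; not part of Solr. */
final class WithinGroupTopN {

    private WithinGroupTopN() {}

    /**
     * Number of documents to keep per group when merging shard responses.
     * Assumption: a negative limit (e.g. group.limit=-1) means "no limit".
     */
    static int withinGroupTopN(int withinGroupOffset, int withinGroupLimit) {
        if (withinGroupLimit < 0) {
            // Treat "unlimited" as the largest representable count.
            return Integer.MAX_VALUE;
        }
        // Guard against a negative offset.
        int offset = Math.max(0, withinGroupOffset);
        // Add in long arithmetic so offset + limit cannot overflow int.
        long sum = (long) offset + withinGroupLimit;
        return (int) Math.min(sum, (long) Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(withinGroupTopN(0, 5));   // ordinary limit
        System.out.println(withinGroupTopN(2, 5));   // offset plus limit
        System.out.println(withinGroupTopN(0, -1));  // "no limit" case
    }
}
```

The sum is computed as a {{long}} because a plain {{int}} addition of a large limit and a non-zero offset could wrap around to a negative value.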

We were evaluating using groups for some functionality, but I think we can 
achieve similar results with sub-faceting by id and a second request to Solr.

> Result Grouping: Number of results in group is not according to specs
> ---------------------------------------------------------------------
>
>                 Key: SOLR-7939
>                 URL: https://issues.apache.org/jira/browse/SOLR-7939
>             Project: Solr
>          Issue Type: Bug
>          Components: search, SolrCloud
>    Affects Versions: 4.7.1
>            Reporter: Esther Goldbraich
>            Priority: Major
>
> When using result grouping (group=true), Solr specs state the following about 
> the "rows" and "group.limit" params:
> rows - The number of groups to return.
> group.limit -  Number of rows to return in each group.
> We are using Solr cloud with a single collection and 64 shards. 
> When grouping by field (i.e. using the group.field parameter), the behavior 
> is as expected.
> However, when grouping by query (using group.query), the number of documents 
> inside each group is affected by the rows param, instead of the group.limit 
> param.
> This is different than what is mentioned in the specs.
> Moreover, when switching to a non-sharded environment (64 collections, 1 
> shard per collection), the behavior is different, and the number of docs 
> inside each group is affected by the group.limit param, as expected.



