[
https://issues.apache.org/jira/browse/SOLR-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365690#comment-16365690
]
John Montgomery commented on SOLR-7939:
---------------------------------------
I've looked into this a bit and it appears to be due to
{{TopGroupsShardResponseProcessor}} using the wrong value when merging results.
It appears to be using the value from {{rows}} instead of the one from
{{group.limit}}.
In particular this line:
[https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/grouping/distributed/responseprocessor/TopGroupsShardResponseProcessor.java#L176]
Which is:
{code:java}
int topN = rb.getGroupingSpec().getOffset()
    + rb.getGroupingSpec().getLimit();
{code}
Roughly speaking it should be something like:
{code:java}
int topN = rb.getGroupingSpec().getWithinGroupOffset()
    + rb.getGroupingSpec().getWithinGroupLimit();
{code}
From local testing, that fix does work (the correct group count is returned, etc.).
However, the tests in {{TestDistributedGrouping}} then fail. This appears to
be due to them passing {{group.limit=-1}}. I tried to update the code to
allow for that, but the tests still fail with a mismatch between the number of
results from the distributed vs non-distributed search (6 vs 7):
{code:java}
junit.framework.AssertionFailedError: .grouped[a_t:kings OR a_t:eggs].doclist.size():6!=7
{code}
So it looks like the fix may need to be a bit more complex.
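For illustration only, here is a rough, standalone sketch of the kind of {{group.limit=-1}} handling mentioned above. It is not the actual Solr code: the class {{TopNSketch}} and the helper {{withinGroupTopN}} are hypothetical names, and it assumes that a negative within-group limit means "no limit". As noted, handling this case alone was not enough to make {{TestDistributedGrouping}} pass, so treat it as one piece of a possible fix rather than a complete one.

```java
// Illustrative sketch only; not taken from TopGroupsShardResponseProcessor.
final class TopNSketch {

    private TopNSketch() {}

    /**
     * Hypothetical helper: number of documents to keep per group when
     * merging shard responses, derived from the within-group offset and
     * limit rather than from the top-level offset and rows.
     */
    static int withinGroupTopN(int withinGroupOffset, int withinGroupLimit) {
        if (withinGroupOffset < 0) {
            throw new IllegalArgumentException("offset must be >= 0");
        }
        // Assumption: a negative limit (e.g. group.limit=-1) means "unlimited".
        if (withinGroupLimit < 0) {
            return Integer.MAX_VALUE;
        }
        // Add as long so that a large offset plus limit cannot overflow int.
        long sum = (long) withinGroupOffset + withinGroupLimit;
        return (int) Math.min(sum, Integer.MAX_VALUE);
    }
}
```

With this helper, an offset of 2 and a limit of 5 gives a topN of 7, while a limit of -1 gives {{Integer.MAX_VALUE}} regardless of the offset.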
We were evaluating grouping for some functionality, but I think we can
achieve similar results with sub-faceting by id and a second request to Solr.
> Result Grouping: Number of results in group is not according to specs
> ---------------------------------------------------------------------
>
> Key: SOLR-7939
> URL: https://issues.apache.org/jira/browse/SOLR-7939
> Project: Solr
> Issue Type: Bug
> Components: search, SolrCloud
> Affects Versions: 4.7.1
> Reporter: Esther Goldbraich
> Priority: Major
>
> When using result grouping (group=true), Solr specs state the following about
> the "rows" and "group.limit" params:
> rows - The number of groups to return.
> group.limit - Number of rows to return in each group.
> We are using Solr cloud with a single collection and 64 shards.
> When grouping by field (i.e. using the group.field parameter), the behavior
> is as expected.
> However, when grouping by query (using group.query), the number of documents
> inside each group is affected by the rows param, instead of the group.limit
> param.
> This is different than what is mentioned in the specs.
> Moreover, when switching to a non-sharded environment (64 collections, 1
> shard per collection), the behavior is different, and the number of docs
> inside each group is affected by the group.query param, as expected.