[ https://issues.apache.org/jira/browse/SOLR-11831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470603#comment-16470603 ]
Ilayaraja commented on SOLR-11831:
----------------------------------
Does this apply to both distributed and non-distributed Solr setups?
> Skip second grouping step if group.limit is 1 (aka Las Vegas patch)
> --------------------------------------------------------------------
>
> Key: SOLR-11831
> URL: https://issues.apache.org/jira/browse/SOLR-11831
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Malvina Josephidou
> Priority: Minor
>
> In cases where we do grouping and ask for {{group.limit=1}} only, it is
> possible to skip the second grouping step. In our test datasets this improved
> speed by around 40%.
> Essentially, in the first grouping step each shard returns the top K groups
> based on the highest-scoring document in each group. The top K groups from
> each shard are merged in the federator, and in the second step we ask all the
> shards to return the top documents from each of the top-ranking groups.
> If we only want to return the highest-scoring document per group, we can
> return the top document id in the first step, merge results in the federator
> to retain the top K groups, and then skip the second grouping step entirely.
> This is possible provided that:
> a) We do not need to know the total number of matching documents per group.
> b) The within-group sort and the between-group sort are the same.
> c) We are not doing reranking, because reranking is done in the second
> grouping step. (It is possible to get this to work with reranking, but
> more work and some additional assumptions are required.)
>
> This patch applies the grouping optimisation in cases where a)-c) apply and
> we are only sorting by relevance. It is also possible to extend this work to
> handle multiple sorting criteria and also reranking.
> P.S. Diego and I called this patch "las vegas" because we started to write it
> on the flight to Las Vegas for Lucene/Solr Revolution.
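
As a rough illustration of the single-pass merge described in the quoted
issue, here is a minimal Python sketch. It is not the patch's code and does
not use Solr's API: GroupHit, can_skip_second_step and merge_top_groups are
hypothetical names, and the "score desc" string is only an assumed way of
saying "sort by relevance". It assumes each shard already reports, for each
of its top K groups, the id and score of that group's best document.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class GroupHit:
    """One group as reported by a shard in the first grouping step (hypothetical type)."""
    group_value: str  # value of the grouping field
    top_doc_id: str   # id of the highest-scoring document in this group on the shard
    top_score: float  # score of that document


def can_skip_second_step(group_limit: int, need_group_counts: bool,
                         group_sort: str, within_group_sort: str,
                         reranking: bool) -> bool:
    """Conditions a)-c) from the issue, plus the relevance-only restriction."""
    return (group_limit == 1
            and not need_group_counts                                  # a)
            and group_sort == within_group_sort == "score desc"       # b) + relevance only
            and not reranking)                                         # c)


def merge_top_groups(shard_results: List[List[GroupHit]], k: int) -> List[GroupHit]:
    """Keep the best hit per group across all shards, then return the top k groups."""
    best: Dict[str, GroupHit] = {}
    for hits in shard_results:
        for hit in hits:
            current = best.get(hit.group_value)
            if current is None or hit.top_score > current.top_score:
                best[hit.group_value] = hit
    # Rank groups by the score of their best document, highest first.
    return sorted(best.values(), key=lambda h: h.top_score, reverse=True)[:k]
```

Because every merged entry already carries the id of its group's best
document, the federator in this sketch has everything it needs for a
group.limit=1 response without sending a second request to the shards.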
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)