Malvina Josephidou created SOLR-11831:
-----------------------------------------

             Summary: Skip second grouping step if group.limit is 1 (aka Las Vegas patch)
                 Key: SOLR-11831
                 URL: https://issues.apache.org/jira/browse/SOLR-11831
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Malvina Josephidou
            Priority: Minor


When a grouped query asks for only one document per group
({{group.limit=1}}), it is possible to skip the second grouping step. On our
test datasets, skipping it improved speed by around 40%.

Essentially, in the first grouping step, each shard returns its top K groups,
ranked by the highest-scoring document in each group. The federator merges the
top K groups from each shard, and in the second step we ask all the shards to
return the top documents from each of the top-ranking groups.
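
As a rough illustration of the two-step flow described above, here is a minimal in-memory sketch. It is not Solr code: the shard layout (a dict from group value to scored documents) and the function names `first_step`, `merge_groups`, `second_step` and `two_step_search` are assumptions made up for this example, and sorting is by score only.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory shard: group value -> list of (doc_id, score).
Shard = Dict[str, List[Tuple[str, float]]]


def first_step(shard: Shard, k: int) -> List[Tuple[str, float]]:
    """Return the shard's top-k groups as (group, best score in that group)."""
    ranked = [(g, max(score for _, score in docs)) for g, docs in shard.items()]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:k]


def merge_groups(per_shard: List[List[Tuple[str, float]]], k: int) -> List[str]:
    """Federator: keep the k groups with the highest best-document score."""
    best: Dict[str, float] = {}
    for groups in per_shard:
        for group, score in groups:
            best[group] = max(score, best.get(group, float("-inf")))
    return sorted(best, key=lambda g: best[g], reverse=True)[:k]


def second_step(shard: Shard, groups: List[str], limit: int) -> Dict[str, List[Tuple[str, float]]]:
    """Return the shard's top `limit` documents for each requested group."""
    return {
        g: sorted(shard[g], key=lambda d: d[1], reverse=True)[:limit]
        for g in groups
        if g in shard
    }


def two_step_search(shards: List[Shard], k: int, limit: int):
    """Run both steps: one round trip for groups, a second one for documents."""
    top_groups = merge_groups([first_step(s, k) for s in shards], k)
    merged: Dict[str, List[Tuple[str, float]]] = {g: [] for g in top_groups}
    for shard in shards:  # second round trip to every shard
        for g, docs in second_step(shard, top_groups, limit).items():
            merged[g].extend(docs)
    return [
        (g, sorted(merged[g], key=lambda d: d[1], reverse=True)[:limit])
        for g in top_groups
    ]
```

Note that every shard is contacted twice per query in this sketch, which is the cost the proposed optimisation targets.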

If we only want to return the highest-scoring document per group, we can return
the top document ID in the first step, merge the results in the federator to
retain the top K groups, and then skip the second grouping step entirely. This
is possible provided that:

a) We do not need to know the total number of matching documents per group.
b) The within-group sort and the between-group sort are the same.
c) We are not doing reranking, because reranking is done in the second
grouping step. (It is possible to get this to work with reranking, but that
requires more work and some additional assumptions.)
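
The conditions above and the single-step merge can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: `group.limit`, `sort`, `group.sort` and `rq` are existing Solr request parameters, but `need_group_counts` is a made-up flag standing in for condition a), and the helper names and in-memory shard layout are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory shard: group value -> list of (doc_id, score).
Shard = Dict[str, List[Tuple[str, float]]]


def can_skip_second_step(params: Dict[str, str]) -> bool:
    """Check conditions a)-c) plus the relevance-only restriction."""
    sort = params.get("sort", "score desc")
    group_sort = params.get("group.sort", sort)  # within-group sort defaults to sort
    return (
        params.get("group.limit", "1") == "1"
        and params.get("need_group_counts", "false") == "false"  # a) made-up flag
        and group_sort == sort                                    # b)
        and "rq" not in params                                    # c) no reranking
        and sort == "score desc"                                  # relevance only
    )


def first_step_with_top_doc(shard: Shard, k: int) -> List[Tuple[str, Tuple[str, float]]]:
    """Per shard: top-k groups, each carrying its best (doc_id, score)."""
    best = [(g, max(docs, key=lambda d: d[1])) for g, docs in shard.items()]
    best.sort(key=lambda item: item[1][1], reverse=True)
    return best[:k]


def single_step_search(shards: List[Shard], k: int) -> List[Tuple[str, Tuple[str, float]]]:
    """Federator: merge per-shard results and keep the top-k groups.

    Each group already carries its best document, so no second
    round trip to the shards is needed.
    """
    top: Dict[str, Tuple[str, float]] = {}
    for shard in shards:
        for group, doc in first_step_with_top_doc(shard, k):
            if group not in top or doc[1] > top[group][1]:
                top[group] = doc
    ranked = sorted(top.items(), key=lambda item: item[1][1], reverse=True)
    return ranked[:k]
```

Because each shard's response already includes the best document of every group it reports, the federator only has to keep the higher-scoring candidate when the same group arrives from several shards.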
 
This patch applies the grouping optimisation in cases where a)-c) hold and we
sort by relevance only. The work could be extended to handle multiple sort
criteria as well as reranking.

P.S. Diego and I called this patch "Las Vegas" because we started writing it
on the flight to Las Vegas for Lucene/Solr Revolution.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

