[jira] [Commented] (SOLR-11769) Sorting performance degrades when useFilterForSortedQuery is enabled and there is no filter query specified

Yonik Seeley (JIRA) Wed, 20 Dec 2017 03:08:12 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298292#comment-16298292
 ]


Yonik Seeley commented on SOLR-11769:
-------------------------------------

We just need to be careful about returning cached sets where we did not before 
(and check that we never modify sets returned, as well as document that the 
returned set should not be modified).
For DocSetUtil.createDocSet specifically, it feels like potentially returning 
liveDocs should either be moved to a higher level caching function, or we 
should rename createDocSet since it can now sometimes just return an existing 
shared set.
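
To illustrate the concern about handing out a shared set, here is a minimal sketch of a read-only wrapper whose operations never modify their operands. It uses java.util.BitSet as a stand-in for a doc set; the class SharedDocs and its methods are hypothetical names for illustration, not Solr classes or the approach taken in the patch.

```java
import java.util.BitSet;

// Hypothetical read-only wrapper (an assumption for illustration, not a Solr class).
final class SharedDocs {
    private final BitSet bits;

    // Defensive copy on the way in, so later changes to 'source' cannot leak in.
    SharedDocs(BitSet source) {
        this.bits = (BitSet) source.clone();
    }

    boolean contains(int doc) {
        return bits.get(doc);
    }

    int size() {
        return bits.cardinality();
    }

    // Returns a new set; neither this set nor 'other' is modified.
    // BitSet.and() mutates its receiver, so it is applied to a clone.
    SharedDocs intersection(SharedDocs other) {
        BitSet copy = (BitSet) bits.clone();
        copy.and(other.bits);
        return new SharedDocs(copy);
    }
}
```

Because no method exposes or mutates the underlying bits, an instance like this could in principle be cached and returned to many callers without one caller corrupting the set seen by another.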

> Sorting performance degrades when useFilterForSortedQuery is enabled and 
> there is no filter query specified
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11769
>                 URL: https://issues.apache.org/jira/browse/SOLR-11769
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: search
>    Affects Versions: 4.10.4
>         Environment: OS: macOS Sierra (version 10.12.4)
> Memory: 16GB
> CPU: 2.9 GHz Intel Core i7
> Java Version: 1.8
>            Reporter: Betim Deva
>            Assignee: David Smiley
>              Labels: performance
>         Attachments: SOLR-11769_Optimize_MatchAllDocsQuery_more.patch
>
>
> Sorting performance degrades significantly when 
> {{useFilterForSortedQuery}} is enabled and no filter query is specified.
> *Steps to Reproduce:*
> 1. Set {{useFilterForSortedQuery=true}} in {{solrconfig.xml}}
> 2. Run a query that matches and returns a single document, with a sort added
> - Example: {{/select?q=foo:123&sort=bar+desc}}
> On a large index (> 10 million documents), this yields a slow response 
> (a few hundred milliseconds on average) even when the result set 
> consists of a single document.
> *Observation 1:*
> - Disabling {{useFilterForSortedQuery}} brings the response time down to < 1 ms.
> *Observation 2:*
> - Removing the {{sort}} brings the response time down to < 1 ms.
> *Observation 3:*
> - Keeping the {{sort}} and adding any filter query (such as {{fq=\*:\*}}) 
> brings the response time down to < 1 ms.
> After profiling 
> [SolrIndexSearcher.java|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=blob;f=solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java;h=9ee5199bdf7511c70f2cc616c123292c97d36b5b;hb=HEAD#l1400],
> I found that the bottleneck is 
> {{DocSet bigFilt = getDocSet(cmd.getFilterList());}} 
> when {{cmd.getFilterList()}} returns {{null}}. In that case 
> {{getDocSet()}} collects document ids every single time it is called, 
> without any caching.
> {code:java}
>     if (useFilterCache) {
>       // now actually use the filter cache.
>       // for large filters that match few documents, this may be
>       // slower than simply re-executing the query.
>       if (out.docSet == null) {
>         out.docSet = getDocSet(cmd.getQuery(), cmd.getFilter());
>         DocSet bigFilt = getDocSet(cmd.getFilterList());
>         if (bigFilt != null) out.docSet = out.docSet.intersection(bigFilt);
>       }
>       // todo: there could be a sortDocSet that could take a list of
>       // the filters instead of anding them first...
>       // perhaps there should be a multi-docset-iterator
>       sortDocSet(qr, cmd);
>     }
> {code}
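
The bottleneck described above (re-collecting all document ids whenever the filter list is empty) could be avoided by memoizing the match-all set. The following is a minimal sketch under stated assumptions: SketchSearcher, collectAllDocs, and fullScans are hypothetical names, java.util.BitSet stands in for DocSet, and deleted documents are ignored. It is not the code from the attached patch.

```java
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for a searcher; none of these names are Solr APIs.
final class SketchSearcher {
    private final int maxDoc;
    private BitSet liveDocsCache;   // built lazily on first use, then shared
    int fullScans = 0;              // counts expensive collections, for illustration

    SketchSearcher(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    // Expensive path: marks every document id, like collecting a match-all query.
    private BitSet collectAllDocs() {
        fullScans++;
        BitSet all = new BitSet(maxDoc);
        all.set(0, maxDoc);
        return all;
    }

    // With no filters, return the memoized set instead of re-collecting.
    // The returned set is shared in that case: callers must not modify it.
    synchronized BitSet getDocSet(List<BitSet> filters) {
        if (filters == null || filters.isEmpty()) {
            if (liveDocsCache == null) {
                liveDocsCache = collectAllDocs();
            }
            return liveDocsCache;
        }
        // With filters, intersect copies so the inputs stay untouched.
        BitSet result = (BitSet) filters.get(0).clone();
        for (int i = 1; i < filters.size(); i++) {
            result.and(filters.get(i));
        }
        return result;
    }
}
```

In this sketch, repeated calls with a null or empty filter list trigger only one full scan. The trade-off is the one raised in the comment above: the no-filter path now returns a shared instance, so it has to be treated as read-only.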



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

