[ https://issues.apache.org/jira/browse/SOLR-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380721#comment-16380721 ]
ASF subversion and git services commented on SOLR-11769:
--------------------------------------------------------

Commit 9b3d68843beb5e0a834d6847446a480742665805 in lucene-solr's branch refs/heads/branch_7x from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9b3d688 ]

SOLR-11769: optimize useFilterForSortedQuery=true when no filter queries

(cherry picked from commit ef98912)

> Sorting performance degrades when useFilterForSortedQuery is enabled and there is no filter query specified
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11769
>                 URL: https://issues.apache.org/jira/browse/SOLR-11769
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: search
>   Affects Versions: 4.10.4
>         Environment: OS: macOS Sierra (version 10.12.4)
>                      Memory: 16GB
>                      CPU: 2.9 GHz Intel Core i7
>                      Java Version: 1.8
>            Reporter: Betim Deva
>            Assignee: David Smiley
>            Priority: Major
>              Labels: performance
>         Attachments: SOLR-11769_Optimize_MatchAllDocsQuery_more.patch
>
>
> Sorting performance degrades significantly when {{useFilterForSortedQuery}}
> is enabled and no filter query is specified.
> *Steps to Reproduce:*
> 1. Set {{useFilterForSortedQuery=true}} in {{solrconfig.xml}}
> 2. Run a query that matches and returns a single document, and add a sort
> - Example: {{/select?q=foo:123&sort=bar+desc}}
> With a large index (> 10 million documents), this yields a slow response
> (a few hundred milliseconds on average), even when the result set
> consists of a single document.
> *Observation 1:*
> - Disabling {{useFilterForSortedQuery}} improves the response time to < 1 ms.
> *Observation 2:*
> - Removing the {{sort}} improves the response time to < 1 ms.
> *Observation 3:*
> - Keeping the {{sort}} and adding any filter query (such as {{fq=\*:\*}})
> improves the response time to < 1 ms.
> After profiling
> [SolrIndexSearcher.java|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=blob;f=solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java;h=9ee5199bdf7511c70f2cc616c123292c97d36b5b;hb=HEAD#l1400]
> I found that the bottleneck is on
> {{DocSet bigFilt = getDocSet(cmd.getFilterList());}}
> when {{cmd.getFilterList()}} is passed in as {{null}}. This makes
> {{getDocSet()}} collect document ids every time it is called,
> without any caching.
> {code:java}
> if (useFilterCache) {
>   // now actually use the filter cache.
>   // for large filters that match few documents, this may be
>   // slower than simply re-executing the query.
>   if (out.docSet == null) {
>     out.docSet = getDocSet(cmd.getQuery(), cmd.getFilter());
>     DocSet bigFilt = getDocSet(cmd.getFilterList());
>     if (bigFilt != null) out.docSet = out.docSet.intersection(bigFilt);
>   }
>   // todo: there could be a sortDocSet that could take a list of
>   // the filters instead of anding them first...
>   // perhaps there should be a multi-docset-iterator
>   sortDocSet(qr, cmd);
> }
> {code}
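
To illustrate the cost described in the report, here is a minimal, self-contained sketch. It is not Solr code and not the committed fix: the class {{FilterShortCircuitSketch}}, its methods, and the use of {{java.util.BitSet}} as a stand-in for {{DocSet}} are all assumptions made for illustration. It models a filter lookup that scans every document when the filter list is null or empty, and shows that skipping the lookup in that case returns the same result without the scan.

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical, simplified model of the reported behavior; not Solr's classes. */
final class FilterShortCircuitSketch {

    /** Counts how often a full "match all documents" scan was performed. */
    static int fullScans = 0;

    /**
     * Stand-in for an uncached filter lookup: with a null or empty filter
     * list it builds the set of all documents, which is the expensive step.
     */
    static BitSet collectDocSet(List<BitSet> filters, int maxDoc) {
        BitSet result = new BitSet(maxDoc);
        result.set(0, maxDoc); // start from every document in the index
        if (filters == null || filters.isEmpty()) {
            fullScans++; // models collecting all document ids again
            return result;
        }
        for (BitSet filter : filters) {
            result.and(filter); // AND the filters together
        }
        return result;
    }

    /** Always intersects with the filter set, even when there are no filters. */
    static BitSet naive(BitSet queryDocs, List<BitSet> filters, int maxDoc) {
        BitSet out = (BitSet) queryDocs.clone();
        out.and(collectDocSet(filters, maxDoc));
        return out;
    }

    /** Skips the filter lookup entirely when there is nothing to intersect. */
    static BitSet guarded(BitSet queryDocs, List<BitSet> filters, int maxDoc) {
        BitSet out = (BitSet) queryDocs.clone();
        if (filters == null || filters.isEmpty()) {
            return out; // intersecting with "all documents" changes nothing
        }
        out.and(collectDocSet(filters, maxDoc));
        return out;
    }

    public static void main(String[] args) {
        int maxDoc = 1000;
        BitSet queryDocs = new BitSet(maxDoc);
        queryDocs.set(123); // the query matches a single document

        BitSet a = naive(queryDocs, null, maxDoc);
        System.out.println("naive:   " + a + ", full scans so far: " + fullScans);

        BitSet b = guarded(queryDocs, null, maxDoc);
        System.out.println("guarded: " + b + ", full scans so far: " + fullScans);
    }
}
```

In this model, both methods return the same single-document set, but only {{naive}} pays for the full scan. How the actual commit avoids that work is not shown here; see the linked commit for the real change.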