[jira] [Commented] (SOLR-10727) Optimize faceting on empty docSet to return faster and not pollute filter cache

Hoss Man (JIRA) Mon, 22 May 2017 14:43:15 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-10727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020247#comment-16020247
 ]


Hoss Man commented on SOLR-10727:
---------------------------------

+1 ... but if we're going to optimize this situation, why not (also) optimize 
it slightly higher up and completely avoid constructing the Query objects (and, 
in some cases, additional overhead)?

For example: the first usage of {{SolrIndexSearcher.numDocs(Query,DocSet)}} I 
found was {{RangeFacetProcessor.rangeCount(DocSet subset,...)}} ... if the 
first line of that method were {{if (0 == subset.size()) return 0;}} then we'd 
not only optimize away the SolrIndexSearcher hit, but also avoid fetching the 
SchemaField & building the range query (not to mention the much more expensive 
{{getGroupedFacetQueryCount}} in the grouping case).
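
A rough sketch of that caller-side guard, using toy stand-ins rather than the real Solr classes. {{ToyDocSet}}, {{countViaQuery}} and the {{queriesBuilt}} counter are hypothetical names made up for illustration; only the {{0 == subset.size()}} check mirrors the suggestion above.

```java
import java.util.Arrays;

final class RangeCountSketch {
    /** Hypothetical stand-in for a DocSet: just the field values of the docs in the set. */
    static final class ToyDocSet {
        final long[] values;
        ToyDocSet(long... values) { this.values = values; }
        int size() { return values.length; }
    }

    /** Incremented whenever the "expensive" path runs, so the short-circuit is observable. */
    static int queriesBuilt = 0;

    /** Stand-in for fetching the field, building the range query and asking the searcher. */
    static int countViaQuery(ToyDocSet subset, long low, long high) {
        queriesBuilt++;
        return (int) Arrays.stream(subset.values)
                .filter(v -> v >= low && v < high)
                .count();
    }

    /** Counts docs of the subset whose value falls in [low, high). */
    static int rangeCount(ToyDocSet subset, long low, long high) {
        // The proposed guard: an empty base set can never produce a non-zero count,
        // so skip query construction and the searcher call entirely.
        if (0 == subset.size()) return 0;
        return countViaQuery(subset, low, high);
    }

    public static void main(String[] args) {
        ToyDocSet empty = new ToyDocSet();
        ToyDocSet some = new ToyDocSet(5, 15, 25);
        System.out.println(rangeCount(empty, 10, 20) + " queriesBuilt=" + queriesBuilt);
        System.out.println(rangeCount(some, 10, 20) + " queriesBuilt=" + queriesBuilt);
    }
}
```

With the guard at the top of the caller, the empty case never reaches the code that would build a Query, which is the extra saving over guarding only inside {{numDocs}}.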

At a glance, most other callers of {{SolrIndexSearcher.numDocs(Query,DocSet)}} 
could be trivially optimized this way as well -- at a minimum to eliminate Query 
parsing/construction.


> Optimize faceting on empty docSet to return faster and not pollute filter 
> cache
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-10727
>                 URL: https://issues.apache.org/jira/browse/SOLR-10727
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Minor
>             Fix For: master (7.0)
>
>         Attachments: SOLR_10727_SolrIndexSearcher_numDocs_0_docSet.patch
>
>
> In certain faceting circumstances (both classic SimpleFacets and JSON 
> Facets), {{SolrIndexSearcher.numDocs(q,docSet)}} is invoked.  numDocs can be 
> improved to return early if the passed docSet (the base of documents to 
> compute facets on) is empty.  Since it doesn't today, it'll go create a 
> filter cache entry (if it doesn't exist) for the query.  Range faceting is a 
> heavy user of this method since it'll be called for each range.  If you're 
> doing date range faceting and do time based sharding and have a time based 
> filter query, then there's a decent chance the current index in question 
> won't match the query.  So let's not pollute the filter cache.
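
To illustrate the behaviour the issue description asks for, here is a minimal sketch with a toy cache, not Solr's actual implementation. {{NumDocsSketch}}, its {{filterCache}} map and the string query key are all assumptions made for the example; the point is only that returning early on an empty docSet leaves the cache untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.IntPredicate;

final class NumDocsSketch {
    /** Toy filter cache: query key -> ids of all docs in the index matching that query. */
    final Map<String, Set<Integer>> filterCache = new HashMap<>();
    /** Docs in this toy index are simply the ids 0..maxDoc-1. */
    final int maxDoc;

    NumDocsSketch(int maxDoc) { this.maxDoc = maxDoc; }

    /** Number of docs in docSet that also match the query. */
    int numDocs(String queryKey, IntPredicate query, Set<Integer> docSet) {
        // Early return: nothing can intersect an empty set, so do not
        // compute (and cache) the full match set for this query.
        if (docSet.isEmpty()) return 0;

        Set<Integer> matches = filterCache.computeIfAbsent(queryKey, k -> {
            Set<Integer> s = new TreeSet<>();
            for (int doc = 0; doc < maxDoc; doc++) {
                if (query.test(doc)) s.add(doc);
            }
            return s;
        });

        int count = 0;
        for (int doc : docSet) {
            if (matches.contains(doc)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        NumDocsSketch searcher = new NumDocsSketch(10);
        IntPredicate even = d -> d % 2 == 0;
        System.out.println(searcher.numDocs("even", even, new TreeSet<>())
                + " cacheEntries=" + searcher.filterCache.size());
    }
}
```

In this toy version each range of a range facet would correspond to a distinct query key, so without the early return an empty base set would still add one cache entry per range.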



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

