[ 
https://issues.apache.org/jira/browse/SOLR-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885366#comment-13885366
 ] 

Jan Høydahl commented on SOLR-5444:
-----------------------------------

I'm not familiar with this part of the code, but the patch seems to fix a real 
problem for large installs. Could someone have a look? [~ysee...@gmail.com]?

> Slow response on facet search, lots of facets, asking for few facets in 
> response
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-5444
>                 URL: https://issues.apache.org/jira/browse/SOLR-5444
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 4.4
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: docvalue, faceted-search, performance
>             Fix For: 4.7
>
>         Attachments: Profiiling_SimpleFacets_getListedTermCounts_path.png, 
> Profiling_SimpleFacets_getTermCounts_path.png, 
> Responsetime_func_of_facets_asked_for-Simple_DocSetCollector_fix.png, 
> Responsetime_func_of_facets_asked_for.png, 
> SOLR-5444_ExpandingIntArray_DocSetCollector_4_4_0.patch, 
> SOLR-5444_simple_DocSetCollector_4_4_0.patch
>
>
> h5. Setup
> We have a 6-Solr-node (release 4.4.0) setup with 12 billion "small" documents 
> loaded across 3 collections. The documents have the following fields
> * a_dlng_doc_sto (docvalue long)
> * b_dlng_doc_sto (docvalue long)
> * c_dstr_doc_sto (docvalue string)
> * timestamp_lng_ind_sto  (indexed long)
> * d_lng_ind_sto (indexed long)
> From schema.xml
> {code}
>     <dynamicField name="*_dstr_doc_sto" type="dstring" indexed="false" stored="true" required="true" docValues="true"/>
>     <dynamicField name="*_lng_ind_sto" type="long" indexed="true" stored="true"/>
>     <dynamicField name="*_dlng_doc_sto" type="dlng" indexed="false" stored="true" required="true" docValues="true"/>
> ...
>     <fieldType name="dstring" class="solr.StrField" sortMissingLast="true" docValuesFormat="Disk"/>
>     <fieldType name="dlng" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0" docValuesFormat="Disk"/>
> {code}
> timestamp_lng_ind_sto decides which collection a document goes into
> We execute queries in the following format:
> * q=timestamp_lng_ind_sto:\[x TO y\] AND d_lng_ind_sto:(a OR b OR ... OR n)
> * facet=true&facet.field=a_dlng_doc_sto&facet.zeros=false&facet.mincount=1&facet.limit=<asked-for-facets>&rows=0&start=0
> h5. Problem 
> We see very slow response times when a query hits a large number of rows 
> spanning lots of facets, but asks for only "a few" of those facets
> h5. Concrete example of query to get some concrete numbers to look at
> With x and y plus a, b ... n set to values so that
> * The timestamp_lng_ind_sto:\[x TO y\] part of the search criteria alone hits 
> about 1.7 billion documents (all of them in one of the three collections, 
> the one containing 4.5 billion docs, but that is not important)
> * The d_lng_ind_sto:(a OR b OR ... OR n) part of the search criteria alone 
> hits about 500000 documents
> * The combined search criteria (timestamp_lng_ind_sto AND'ed with 
> d_lng_ind_sto) hit about 200000 documents
> The following graph shows response time as a function of <asked-for-facets> 
> (in the query)
> !Responsetime_func_of_facets_asked_for.png!
> Note that response time is high for "low" <asked-for-facets>, and that it 
> increases fast (but linearly) with <asked-for-facets> until 
> <asked-for-facets> is somewhere between 5000 (where response time is close 
> to 1000 secs) and 10000 (where response time is about 5 secs). For values of 
> <asked-for-facets> above 10000, response time stays "low" at 1-10 secs
> Looking at the code and the profiling, it is clear that the change to better 
> response time occurs when SimpleFacets.getFacetFieldCounts switches from using 
> getListedTermCounts to using getTermCounts.
> The following image shows profiling information during a request with 
> <asked-for-facets> at about 2000.
> !Profiiling_SimpleFacets_getListedTermCounts_path.png!
> Note that
> * SimpleFacets.getListedTermCounts is used (green box)
> * 91% of the time spent performing the query is spent in the 
> DocSetCollector constructor (red box). During this concrete query, 125000 
> DocSetCollector objects are created, taking 710 secs in total. Additional 
> investigation shows that the time is spent allocating huge int arrays for the 
> "scratch" int array. Several thousand of those DocSetCollector constructors 
> create int arrays of size above 1 million. That takes time, and also leaves 
> a nice little job for the GC afterwards.
> * The actual search-part of the query takes only 0.5% (4 secs) of the 
> combined time executing the query (blue box)
> The following image shows profiling information during a request with 
> <asked-for-facets> at about 10000
> !Profiling_SimpleFacets_getTermCounts_path.png!
> Note that
> * SimpleFacets.getTermCounts is used (green box)
> * The actual search-part of the query now takes 70% (11 secs) of the combined 
> time executing the query (blue box)
> h5. What to do about this?
> * I am not sure why there are two paths that SimpleFacets.getFacetFieldCounts 
> can take (getListedTermCounts or getTermCounts), but I am pretty sure there 
> is a good reason. It seems that getListedTermCounts is used when 
> <asked-for-facets> is noticeably lower than the total number of facets hit 
> (I believe it is when <asked-for-facets> * 1.5 + 10 is below the actual 
> number of facets hit)
> * *One solution* could be to just drop the getListedTermCounts path and 
> always use getTermCounts, but that is probably not a good idea, because 
> getListedTermCounts is probably there for a performance reason (in other 
> scenarios)
> * The comment above DocSetCollector.scratch says
> {code}
>   // in case there aren't that many hits, we may not want a very sparse
>   // bit array.  Optimistically collect the first few docs in an array
>   // in case there are only a few.
>   final int[] scratch;
> {code}
> The comment seems reasonable. But when we look at the values used as 
> "smallSetSize" for the DocSetCollector constructor, it is always 
> "maxDoc >> 6" (basically dividing by 64). This value depends on maxDoc and 
> will be high if maxDoc is high. In my case maxDoc is often 50+ million, 
> resulting in "smallSetSize"s of 1+ million (that is not "a few"). I very 
> much doubt that "smallSetSize" should increase as maxDoc increases. Why not 
> always use a low (fixed or similar) value for "smallSetSize"? Is a huge int 
> array ever a good idea for the "scratch" array?
> * *Another solution* would be to never create "scratch"-arrays with size 
> above e.g. 50
> * *There are probably several other potential solutions*
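
To put rough numbers on the scratch-array sizing discussed above, here is a minimal Java sketch. It only reproduces the arithmetic from the description (smallSetSize = maxDoc >> 6, and 4 bytes per int). The class and method names are hypothetical, the maxDoc value is an assumed example, and none of this is Solr code or either of the attached patches. The capped variant illustrates the "never above e.g. 50" idea.

```java
// Hypothetical helper, not part of Solr: reproduces the sizing arithmetic only.
class ScratchSizing {
    /** Scratch length as described in the issue: maxDoc >> 6 (maxDoc / 64). */
    static int smallSetSize(int maxDoc) {
        return maxDoc >> 6;
    }

    /** Alternative from the issue: never let the scratch array exceed a fixed cap. */
    static int cappedSmallSetSize(int maxDoc, int cap) {
        return Math.min(maxDoc >> 6, cap);
    }

    /** Payload bytes for one int[] of the given length (4 bytes per int, header ignored). */
    static long scratchBytes(int length) {
        return 4L * length;
    }

    public static void main(String[] args) {
        int maxDoc = 64_000_000; // assumed example value in the "50+ million" range
        int uncapped = smallSetSize(maxDoc);
        int capped = cappedSmallSetSize(maxDoc, 50);
        System.out.println("uncapped length = " + uncapped
                + ", bytes per collector = " + scratchBytes(uncapped));
        System.out.println("capped length   = " + capped
                + ", bytes per collector = " + scratchBytes(capped));
    }
}
```

With the assumed maxDoc, every collector allocates about 4 MB up front in the uncapped case, whether or not it ever collects that many hits. The cap trades that for a fallback to the bit set as soon as the (small) array is full.
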
> I would really like your opinion on which solution to pursue, so that I do 
> not unintentionally break good performance optimizations just because I 
> missed some points explaining why the code is as it is today.
> *Note* I have filed this as a 4.4 issue, because that is the platform I use 
> for my tests etc. But I am sure the problem also exists on 4.5.1 (or whatever 
> the latest 4.x release is)
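
As an illustration of one of the "other potential solutions" mentioned in the description, a buffer can start small and grow only when hits actually arrive, instead of being allocated at full size in the constructor. The sketch below is an assumption-laden example of that general technique: the class name and the doubling policy are hypothetical, and it is not the content of the attached ExpandingIntArray patch.

```java
import java.util.Arrays;

// Hypothetical sketch of a grow-on-demand int buffer; not Solr code, not the attached patch.
class GrowableIntBuffer {
    private int[] data;
    private int size;

    GrowableIntBuffer(int initialCapacity) {
        // allocate only a small array up front
        data = new int[Math.max(1, initialCapacity)];
    }

    void add(int value) {
        if (size == data.length) {
            // double the capacity only when the buffer is actually full
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int size() {
        return size;
    }

    int capacity() {
        return data.length;
    }

    /** Returns a copy holding exactly the values added so far. */
    int[] toArray() {
        return Arrays.copyOf(data, size);
    }

    public static void main(String[] args) {
        GrowableIntBuffer buf = new GrowableIntBuffer(8);
        for (int doc = 0; doc < 100; doc++) {
            buf.add(doc);
        }
        System.out.println(buf.size() + " ids stored, capacity " + buf.capacity());
    }
}
```

The cost of a collector that sees few hits stays proportional to the hits it sees, at the price of occasional array copies for collectors that see many.
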



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
