[ https://issues.apache.org/jira/browse/SOLR-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885366#comment-13885366 ]
Jan Høydahl commented on SOLR-5444:
-----------------------------------

I'm not familiar with this part of the code, but the patch seems to fix a real problem for large installs. Does anyone want to have a look? [~ysee...@gmail.com]?

> Slow response on facet search, lots of facets, asking for few facets in response
> --------------------------------------------------------------------------------
>
> Key: SOLR-5444
> URL: https://issues.apache.org/jira/browse/SOLR-5444
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Affects Versions: 4.4
> Reporter: Per Steffensen
> Assignee: Per Steffensen
> Labels: docvalue, faceted-search, performance
> Fix For: 4.7
>
> Attachments: Profiiling_SimpleFacets_getListedTermCounts_path.png,
> Profiling_SimpleFacets_getTermCounts_path.png,
> Responsetime_func_of_facets_asked_for-Simple_DocSetCollector_fix.png,
> Responsetime_func_of_facets_asked_for.png,
> SOLR-5444_ExpandingIntArray_DocSetCollector_4_4_0.patch,
> SOLR-5444_simple_DocSetCollector_4_4_0.patch
>
>
> h5. Setup
> We have a 6-Solr-node (release 4.4.0) setup with 12 billion "small" documents loaded across 3 collections. The documents have the following fields:
> * a_dlng_doc_sto (docvalue long)
> * b_dlng_doc_sto (docvalue long)
> * c_dstr_doc_sto (docvalue string)
> * timestamp_lng_ind_sto (indexed long)
> * d_lng_ind_sto (indexed long)
> From schema.xml
> {code}
> <dynamicField name="*_dstr_doc_sto" type="dstring" indexed="false" stored="true" required="true" docValues="true"/>
> <dynamicField name="*_lng_ind_sto" type="long" indexed="true" stored="true"/>
> <dynamicField name="*_dlng_doc_sto" type="dlng" indexed="false" stored="true" required="true" docValues="true"/>
> ...
> <fieldType name="dstring" class="solr.StrField" sortMissingLast="true" docValuesFormat="Disk"/>
> <fieldType name="dlng" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0" docValuesFormat="Disk"/>
> {code}
> timestamp_lng_ind_sto decides which collection a document goes into.
> We execute queries in the following format:
> * q=timestamp_lng_ind_sto:\[x TO y\] AND d_lng_ind_sto:(a OR b OR ... OR n)
> * facet=true&facet.field=a_dlng_doc_sto&facet.zeros=false&facet.mincount=1&facet.limit=<asked-for-facets>&rows=0&start=0
> h5. Problem
> We see very slow response times when a query hits a large number of rows spanning lots of facets, but asks for only "a few" of those facets.
> h5. Concrete example of query to get some concrete numbers to look at
> With x and y plus a, b ... n set to values so that
> * The timestamp_lng_ind_sto:\[x TO y\] part of the search criteria alone hits about 1.7 billion documents (actually all in one (containing 4.5 billion docs) of the three collections - but that is not important)
> * The d_lng_ind_sto:(a OR b OR ... OR n) part of the search criteria alone hits about 500000 documents
> * The combined search criteria (timestamp_lng_ind_sto AND'ed with d_lng_ind_sto) hit about 200000 documents
> The following graph shows response time as a function of <asked-for-facets> (in query)
> !Responsetime_func_of_facets_asked_for.png!
> Note that response time is high for "low" <asked-for-facets>, and that it increases fast (but linearly) in <asked-for-facets> up until <asked-for-facets> is somewhere in between 5000 (where response time is close to 1000 secs) and 10000 (where response time is about 5 secs). For values of <asked-for-facets> above 10000, response time stays "low" at between 1-10 secs.
> Looking at the code and profiling, it is clear that the change to better response time occurs when SimpleFacets.getFacetFieldCounts changes from using getListedTermCounts to using getTermCounts.
> The following image shows profiling information during a request with <asked-for-facets> at about 2000.
> !Profiiling_SimpleFacets_getListedTermCounts_path.png!
> Note that
> * SimpleFacets.getListedTermCounts is used (green box)
> * 91% of the time spent performing the query is spent in the DocSetCollector constructor (red box). During this concrete query 125000 DocSetCollector objects are created, spending 710 secs all in all. Additional investigations show that the time is spent allocating huge int-arrays for the "scratch" int-array. Several thousand of those DocSetCollector constructors create int-arrays of size above 1 million - that takes time, and also leaves a nice little job for the GC afterwards.
> * The actual search part of the query takes only 0.5% (4 secs) of the combined time executing the query (blue box)
> The following image shows profiling information during a request with <asked-for-facets> at about 10000
> !Profiling_SimpleFacets_getTermCounts_path.png!
> Note that
> * SimpleFacets.getTermCounts is used (green box)
> * The actual search part of the query now takes 70% (11 secs) of the combined time executing the query (blue box)
> h5. What to do about this?
> * I am not sure why there are two paths that SimpleFacets.getFacetFieldCounts can take (getListedTermCounts or getTermCounts) - but I am pretty sure there is a good reason. It seems like getListedTermCounts is used when <asked-for-facets> is noticeably lower than the total number of facets hit (I believe it is when <asked-for-facets> * 1.5 + 10 is below the actual number of facets hit)
> * *One solution* could be to just drop the getListedTermCounts path and always go with getTermCounts, but that is probably not a good idea, because getListedTermCounts is probably there for a performance reason (in other scenarios)
> * The comment above DocSetCollector.scratch says
> {code}
> // in case there aren't that many hits, we may not want a very sparse
> // bit array.
> // Optimistically collect the first few docs in an array
> // in case there are only a few.
> final int[] scratch;
> {code}
> The comment seems reasonable. But when we look at what values are used as "smallSetSize" for the DocSetCollector constructor, it is always "maxDoc >> 6" (basically dividing by 64) - this value depends on maxDoc and will be high if maxDoc is high. In my case maxDoc is often 50+ million, resulting in "smallSetSize"s of 1+ million (that is not "a few"). I very much doubt that "smallSetSize" should increase as maxDoc increases - why not just always use a low (fixed or something) value for "smallSetSize"? Is it ever a good idea to have huge int-arrays for the "scratch" array?
> * *Another solution* would be to never create "scratch" arrays with size above e.g. 50
> * *There are probably several other potential solutions*
> I would really like your opinion on which solution to choose, so that I do not unintentionally break good performance optimizations just because I missed some points explaining why the code is as it is today.
> *Note* I have filed this as a 4.4 issue, because that is the platform I use for my tests etc. But I am sure the problem also exists on 4.5.1 (or whatever the latest 4.x release is)
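
To put rough numbers on the scratch-array sizing questioned in the issue above, here is a minimal Java sketch. It is not Solr code. It assumes only what the description states: "smallSetSize" is computed as maxDoc >> 6, and one proposed alternative is a fixed cap such as 50. The class and method names, the maxDoc of 64 million, and the count of 5,000 collectors are hypothetical values chosen purely for illustration.

```java
public class ScratchSizing {

    // Sizing as described in the issue: smallSetSize = maxDoc >> 6 (maxDoc / 64).
    static int currentSmallSetSize(int maxDoc) {
        return maxDoc >> 6;
    }

    // Hypothetical alternative from the issue: never exceed a small fixed cap.
    static int cappedSmallSetSize(int maxDoc, int cap) {
        return Math.min(maxDoc >> 6, cap);
    }

    // Cumulative int[] payload (4 bytes per int, array headers ignored)
    // allocated up front when `collectors` collectors each create one scratch array.
    static long scratchBytes(int smallSetSize, long collectors) {
        return 4L * smallSetSize * collectors;
    }

    public static void main(String[] args) {
        int maxDoc = 64_000_000;   // assumed example value, not from the issue
        long collectors = 5_000;   // assumed stand-in for "several thousand"
        int cap = 50;              // cap suggested in the issue

        int current = currentSmallSetSize(maxDoc);
        int capped = cappedSmallSetSize(maxDoc, cap);

        System.out.printf("current smallSetSize: %d ints, %d bytes in total%n",
                current, scratchBytes(current, collectors));
        System.out.printf("capped smallSetSize:  %d ints, %d bytes in total%n",
                capped, scratchBytes(capped, collectors));
    }
}
```

With these illustrative inputs the uncapped sizing allocates about 20 GB of int arrays cumulatively across the 5,000 collectors, while a cap of 50 allocates about 1 MB. Whether such a small cap would hurt the scenarios the optimization was written for is the open question raised in the issue.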