Erick, Thanks very much for your help so far with this one. I have captured the logs from a commit which shows a commit and new searcher starting. It appears a few ERROR's are amongst the logs and a few uninverting lines. The query is a very basic query as shown below: <lst> <str name="fq">DataType_s:Product</str> <str name="fq">WebSections_ms:house</str> <str name="fq">VisibleOnline_ms:NAT</str> <str name="q">SS_Stage_ms:Live</str> <str name="start">0</str> <str name="rows">20</str> </lst>
INFO (qtp755840090-55) [ x:core1] o.a.s.c.SolrCore SolrDeletionPolicy.onCommit: commits: num=2 commit{dir=NRTCachingDirectory(MMapDirectory@/var/solr/data/core1/data/index.20131212092900012 lockFactory=org.apache.lucene.store.NativeFSLockFactory@8815140; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_9kgw,generation=446432} commit{dir=NRTCachingDirectory(MMapDirectory@/var/solr/data/core1/data/index.20131212092900012 lockFactory=org.apache.lucene.store.NativeFSLockFactory@8815140; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_9kgx,generation=446433} INFO (qtp755840090-55) [ x:core1] o.a.s.c.SolrCore newest commit generation = 446433 INFO (qtp755840090-55) [ x:core1] o.a.s.s.SolrIndexSearcher Opening Searcher@63f1fac[core1] main INFO (searcherExecutor-7-thread-1-processing-x:core1) [ x:core1] o.a.s.c.SolrCore QuerySenderListener sending requests to Searcher@63f1fac[core1] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_a3p4(5.3.1):C313761/58754:delGen=2073) Uninverting(_a7bs(5.3.1):c22601/9516:delGen=608) Uninverting(_a7np(5.3.1):C37794/14987:delGen=463) Uninverting(_aa7h(5.3.1):c13906/163:delGen=57) Uninverting(_a7u7(5.3.1):c5968/504:delGen=89) Uninverting(_a7yt(5.3.1):c2643/426:delGen=39) Uninverting(_aafw(5.3.1):c313/27:delGen=2) Uninverting(_aajf(5.3.1):c355/14:delGen=3) Uninverting(_aaqa(5.3.1):c195/1:delGen=1) Uninverting(_aapg(5.3.1):c279/3:delGen=3) Uninverting(_aahr(5.3.1):c262/5:delGen=1) Uninverting(_aafa(5.3.1):c265/2:delGen=1) Uninverting(_aap3(5.3.1):c252/2:delGen=2) Uninverting(_aaqb(5.3.1):C1) Uninverting(_aaqd(5.3.1):C1) Uninverting(_aaqh(5.3.1):C1) Uninverting(_aaqj(5.3.1):C1) Uninverting(_aaqm(5.3.1):C2/1:delGen=1) Uninverting(_aaqo(5.3.1):C1) Uninverting(_aaqq(5.3.1):C1) Uninverting(_aaqs(5.3.1):C1)))} ERROR (searcherExecutor-7-thread-1-processing-x:core1) [ x:core1] o.a.s.c.SolrCore Previous SolrRequestInfo was not closed! req=wt=json ERROR (searcherExecutor-7-thread-1-processing-x:core1) [ x:core1] o.a.s.c.SolrCore prev == info : false INFO (searcherExecutor-7-thread-1-processing-x:core1) [ x:core1] o.a.s.c.S.Request [core1] webapp=null path=null params={start=0&event=newSearcher&q=SS_Stage_ms:Live&distrib=false&fq=DataType_s:Product&fq=WebSections_ms:house&fq=VisibleOnline_ms:NAT&rows=20} hits=2541 status=0 QTime=18 INFO (searcherExecutor-7-thread-1-processing-x:core1) [ x:core1] o.a.s.c.SolrCore QuerySenderListener done. INFO (searcherExecutor-7-thread-1-processing-x:core1) [ x:core1] o.a.s.c.SolrCore [core1] Registered new searcher Searcher@63f1fac[core1] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_a3p4(5.3.1):C313761/58754:delGen=2073) Uninverting(_a7bs(5.3.1):c22601/9516:delGen=608) Uninverting(_a7np(5.3.1):C37794/14987:delGen=463) Uninverting(_aa7h(5.3.1):c13906/163:delGen=57) Uninverting(_a7u7(5.3.1):c5968/504:delGen=89) Uninverting(_a7yt(5.3.1):c2643/426:delGen=39) Uninverting(_aafw(5.3.1):c313/27:delGen=2) Uninverting(_aajf(5.3.1):c355/14:delGen=3) Uninverting(_aaqa(5.3.1):c195/1:delGen=1) Uninverting(_aapg(5.3.1):c279/3:delGen=3) Uninverting(_aahr(5.3.1):c262/5:delGen=1) Uninverting(_aafa(5.3.1):c265/2:delGen=1) Uninverting(_aap3(5.3.1):c252/2:delGen=2) Uninverting(_aaqb(5.3.1):C1) Uninverting(_aaqd(5.3.1):C1) Uninverting(_aaqh(5.3.1):C1) Uninverting(_aaqj(5.3.1):C1) Uninverting(_aaqm(5.3.1):C2/1:delGen=1) Uninverting(_aaqo(5.3.1):C1) Uninverting(_aaqq(5.3.1):C1) Uninverting(_aaqs(5.3.1):C1)))} INFO (qtp755840090-55) [ x:core1] o.a.s.u.UpdateHandler end_commit_flush On Tuesday, 18 October 2016, 0:49, Erick Erickson <erickerick...@gmail.com> wrote: Wow, wouldn't it be useful if the name of the field was dumped in the message ;)... You should see a query happen just before that message in the log file. It won't quite be in the same format as a URL, but it's reasonably easy to figure out. Uninverting happens as a result of > sorting > faceting > grouping > ??? So the crude approach wold be to find the query(s) that precede this then break it apart submitting one of the above operations from the query in question at a time while tailing the logs. And by "after the warming queries", I'm assuming that the searcher has successfully opened (there'll be a message in the log). BTW, DocValues fields are strongly recommended for any field that gets uninverted. Best, Erick On Sun, Oct 16, 2016 at 10:28 PM, Dalton Gooding <daltonwestco...@yahoo.com.au> wrote: > Erick, > > I think you might have nailed it. > > After the three warming queries, I get the below when registering a new > searcher. How can I find out what I am missing in my warming queries? > > Registered new searcher Searcher@15cdc368[core1] > main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_a3p4(5.3.1):C313761/57815:delGen=1722) > Uninverting(_a7bs(5.3.1):c22601/8393:delGen=464) > Uninverting(_a7np(5.3.1):C37794/13471:delGen=244) > Uninverting(_a7rp(5.3.1):c2490/760:delGen=77) > Uninverting(_a7u7(5.3.1):c5968/338:delGen=27) > Uninverting(_a7rz(5.3.1):c4306/1718:delGen=94) > Uninverting(_a806(5.3.1):c1734/7:delGen=6) > Uninverting(_a7tx(5.3.1):c1900/67:delGen=10) > Uninverting(_a7tn(5.3.1):c1900/24:delGen=9) > Uninverting(_a7yt(5.3.1):c2643/144:delGen=8) > Uninverting(_a7zn(5.3.1):c1595/2:delGen=2) Uninverting(_a82v(5.3.1):c31) > Uninverting(_a843(5.3.1):c23/3:delGen=1) > Uninverting(_a844(5.3.1):C3/1:delGen=1) Uninverting(_a846(5.3.1):C1) > Uninverting(_a848(5.3.1):C5/1:delGen=1) Uninverting(_a84a(5.3.1):C1) > Uninverting(_a84c(5.3.1):C1)))} > > > On Friday, 14 October 2016, 9:50, Erick Erickson <erickerick...@gmail.com> > wrote: > > > Well, that's a puzzle, sorry I don't have a good answer. > > The fact that your newSearcher event to be fired _should_ populate > the low-level caches. BTW, what's your autowarm count on filterCache and > queryResultCache? Have you tried setting them to non-zero values? Those > won't help when you first start Solr, but that should be rare. > > Do note that your newSearcher events should mention a series of different > values > from the fields in question. > > Actually here's what I'd do. Look carefully in your Solr log after you > do a commit and > particularly when you fire your first query after the new searcher is > opened. Do you see > any suspicious messages like "Uninverting field" or the like? That > would indicate sorting > or faceting on a field that wasn't warmed by the newSearcher because > you didn't mention > sorting or faceting or whatever in your warming query. > > Can we see the newSearcher entry and the query you say is "similar" > but slow. My _guess_ > is that some innocent-seeming difference between the two is the issue. > > Best, > Erick > > On Sun, Oct 9, 2016 at 7:08 PM, Dalton Gooding > <daltonwestco...@yahoo.com.au> wrote: >> Erick, >> >> I have tried tuning the queries with some limited success. I still get >> drastic differences between the first time I fire my warming query (after >> newSearcher ran query) and the second time, or any variant of the query >> i.e. >> removing fields or changing parameters, it runs much faster. >> >> I am not sure what I am missing here, I put a query into the newSearcher >> section that runs fine, but the exact same query run after warming still >> takes the full time of a un-warmed query. >> >> Can you break it down to the most basic type of newSearcher query to try >> and >> shrink the gap between first query and subsequent queries sent? >> >> I cannot see why sending the same query after a newSearcher is slow, when >> subsequent queries run faster. I figured this was the idea of the >> newSearcher stanza's. >> >> >> On Friday, 7 October 2016, 14:45, Erick Erickson <erickerick...@gmail.com> >> wrote: >> >> >> Replying on the public thread, somehow your mail was sent to me privately. >> >> Pasted your email to me below for others. >> >> You are still confusing documents and results. Forget about the rows >> parameter, for this discussion it's irrelevant. >> >> The QTime is the time spent searching. It is unaffected by whether a >> document is in the documentCache or not. >> It _solely_ measures the time that Solr/Lucene take to find the top N >> documents (where N is the rows param) and >> record their internal Lucene doc ID. >> >> Increasing the rows or the document cache won't change anything about >> the QTime. The documentCache is >> totally the wrong place to focus. >> >> >> The response when you re-submit the query suggests that getting the >> top N docs' internal Lucene ID is >> fetched from the queryResultCache. Changing the window size is also >> irrelevant to this discussion. If you >> vary the query even slightly you won't hit the queryResultCache. A >> very easy way to check this is the >> admin UI>>select core>>plugins/stats>>QueryHandler and then probably >> the "select" handler. If you see >> the hits go up after the fast query then you're getting the results >> from the querResultCache. >> >> What _is_ relevant is populating the low-level Lucene caches with >> values from the indexed terms. My >> contention is that this is not happening with match-all queries, i.e. >> field:* or field:[* TO *] because in >> those cases, a doc matches or doesn't based on whether it has anything >> in the field. There's no point >> in finding values since it doesn't matter anyway. And "finding values" >> means reading indexed terms >> from disk into low-level Lucene caches. >> >> When I say "populate the low-level Lucene caches", what I'm really >> talking about is reading them from >> disk into your physical memory via MMapDirectory, see Uwe's excellent >> blog: >> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html >> >> So the suggestion is that you use real values from your index or >> possibly ranges is so that part or all >> of your disk files get read into MMapDirectorySpace via the first or >> new Searcher event. >> >> Please just give it a try. My bet is that you'll see your QTime values >> first time after autowarming >> go down. Significantly. Be sure to use a wide variety of different >> values for autowarming. >> >> BTW, the autowarmCounts in solrconfig.xml filterCache and >> queryResultCache are intended >> to warm by using the last N fq or q clauses on the theory that the >> most recent N are predictive >> of the next N. >> >> Best, >> Erick >> >> >> *************** >> >> I believe the return time back to the command line from the curl >> command and the QTime as shown below >> >> time curl -v >> >> 'http://<SolR>/solr/core1/select?fq=DataType_s%3AProduct&fq=WebSections_ms%3Ahouse&fq=%28VisibleOnline_ms%3ANAT+OR+VisibleOnline_ms%3A7%29&fq=%7B%21tag%3Dcurrent_group%7DGroupIds_ms%3A458&sort=SalesRank_f+desc&facet=true&facet.field=%7B%21ex%3Dcurrent_group%7Dattr_GroupLevel0&facet.field=BrandID_s&facet.field=%7B%21ex%3Dcurrent_group%7Dattr_GroupLevel2&facet.field=%7B%21ex%3Dcurrent_group%7Dattr_GroupLevel1&facet.field=SubBrandID_s&facet.field=ProductAttr_967_ms&facet.field=ProductAttr_NEG21_ms&facet.field=ProductAttr_1852_ms&facet.query=Price_7_f%3A%5B%2A+TO+%2A%5D&facet.query=Price_2_f%3A%5B%2A+TO+%2A%5D&facet.query=Price_3_f%3A%5B%2A+TO+%2A%5D&facet.query=Price_4_f%3A%5B%2A+TO+%2A%5D&facet.query=Price_5_f%3A%5B%2A+TO+%2A%5D&facet.query=Price_6_f%3A%5B%2A+TO+%2A%5D&facet.mincount=1&wt=json&json.nl=map&q=%28title%3A%2A+OR+text%3A%2A%29+AND+%28ms%3ALive%29&start=0&rows=24' >> >> real 0m1.436s >> user 0m0.001s >> sys 0m0.006s >> >> "QTime":1387 >> >> From what you suggested, changing the rows value from 20 to something >> greater should add more documents to the cache. Injunction with tuning >> the queries to remove the * wild card, this should provide a better >> warming query? >> >> Should I also increase the queryResultWindowSize in the solrconfig.xml >> to help built out the cache? >> >> Cheers, >> >> Guy >> >> >> >> >> >> On Thu, Oct 6, 2016 at 4:43 PM, Dalton Gooding >> <daltonwestco...@yahoo.com.au> wrote: >>> Erick, >>> >>> Thanks for the response. After I run the initial query and get a long >>> response time, if I change the query to remove or add additional query >>> statements, I find the speed is good. >>> >>> If I run the modified query after a new searcher has registered, the >>> response is slow but after the modified query has been completed, the >>> warming query sent from CuRl is much faster. I assume it is because the >>> document cache has updated with the documents from the modified query. A >>> large number of our queries work with the same document set, I am trying >>> to >>> get a warming query to populate the document cache to be as big as >>> feasible. >>> >>> Should the firstSearcher and newSearcher warm the document cache? >>> >>> >>> On Friday, 7 October 2016, 9:31, Erick Erickson <erickerick...@gmail.com> >>> wrote: >>> >>> >>> Submitting the exact same query twice will return results from the >>> queryResultCache. I'm not entirely >>> sure that the firstSearcher events get put into the cache. >>> >>> So if you change the query even slighty my guess is that you'll see >>> response times very close to your >>> original ones of over a second. >>> >>> Best, >>> Erick >>> >>> On Thu, Oct 6, 2016 at 2:56 PM, Dalton Gooding >>> <daltonwestco...@yahoo.com.au.invalid> wrote: >>>> After setting a number of newSearcher and firstSearcher queries, I can >>>> see >>>> in the console logs that the queries are run, but when I run the same >>>> query >>>> against the new searcher (using CuRL), I get a slow response time for >>>> the >>>> first run. >>>> >>>> Config: >>>> <listener event="newSearcher" class="solr.QuerySenderListener"> >>>> <arr name="queries"> <lst> <str >>>> name="fq">DataType_s:Product</str> >>>> <str name="fq">WebSections_ms:house</str> <str >>>> name="fq">{!tag=current_group}GroupIds_ms:*</str> >>>> <str name="facet">true</str> <str >>>> name="facet.field">BrandID_s</str> <str >>>> name="facet.query">Price_2_f:[* TO *]</str> <str >>>> name="facet.query">Price_3_f:[* TO *]</str> <str >>>> name="facet.query">Price_4_f:[* TO *]</str> <str >>>> name="facet.query">Price_5_f:[* TO *]</str> <str >>>> name="facet.query">Price_6_f:[* TO *]</str> <str >>>> name="facet.query">Price_7_f:[* TO *]</str> <str >>>> name="facet.query">Price_8_f:[* TO *]</str> <str >>>> name="facet.mincount">1</str> <str >>>> name="facet.method">fc</str> >>>> <str name="wt">json</str> <str name="json.nl">map</str> >>>> <str name="q">(title:* OR text:*)</str> <str >>>> name="start">0</str> <str name="rows">20</str> >>>> </lst> >>>> </arr> </listener> >>>> >>>> Console log: >>>> INFO (searcherExecutor-7-thread-1-processing-x:core1) [ x:core1] >>>> o.a.s.c.S.Request [core1] webapp=null path=null >>>> >>>> >>>> params={facet=true&facet.mincount=1&start=0&facet.query=Price_2_f:[*+TO+*]&facet.query=Price_3_f:[*+TO+*]&facet.query=Price_4_f:[*+TO+*]&facet.query=Price_5_f:[*+TO+*]&facet.query=Price_6_f:[*+TO+*]&facet.query=Price_7_f:[*+TO+*]&facet.query=Price_8_f:[*+TO+*]&event=newSearcher&q=(title:*+OR+text:*)&distrib=false&json.nl=map&facet.field=BrandID_s&wt=json&facet.method=fc&fq=DataType_s:Product&fq=WebSections_ms:house&fq=VisibleOnline_ms:7&fq={!tag%3Dcurrent_group}GroupIds_ms:*&rows=20} >>>> hits=2549 status=0 QTime=1263 >>>> >>>> >>>> If I run the same query after the index has registered I see a QTime of >>>> over a second, the second time I run the query I see around 80ms. This >>>> leads >>>> me to believe the warming did not occur or the query was not commited to >>>> cache on start up of the new searcher. >>>> Can someone please advise on how to use the newSearcher queries to >>>> effectively warm SolR caches. Should I see an improved response for the >>>> first time I run the query if the same query has been used as a >>>> newSearcher >>>> query? >>>> Cheers, >>>> Dalton >>> >>> >> >> On Thu, Oct 6, 2016 at 4:43 PM, Dalton Gooding >> <daltonwestco...@yahoo.com.au> wrote: >>> Erick, >>> >>> Thanks for the response. After I run the initial query and get a long >>> response time, if I change the query to remove or add additional query >>> statements, I find the speed is good. >>> >>> If I run the modified query after a new searcher has registered, the >>> response is slow but after the modified query has been completed, the >>> warming query sent from CuRl is much faster. I assume it is because the >>> document cache has updated with the documents from the modified query. A >>> large number of our queries work with the same document set, I am trying >>> to >>> get a warming query to populate the document cache to be as big as >>> feasible. >>> >>> Should the firstSearcher and newSearcher warm the document cache? >>> >>> >>> On Friday, 7 October 2016, 9:31, Erick Erickson <erickerick...@gmail.com> >>> wrote: >>> >>> >>> Submitting the exact same query twice will return results from the >>> queryResultCache. I'm not entirely >>> sure that the firstSearcher events get put into the cache. >>> >>> So if you change the query even slighty my guess is that you'll see >>> response times very close to your >>> original ones of over a second. >>> >>> Best, >>> Erick >>> >>> >>> On Thu, Oct 6, 2016 at 2:56 PM, Dalton Gooding >>> <daltonwestco...@yahoo.com.au.invalid> wrote: >>>> After setting a number of newSearcher and firstSearcher queries, I can >>>> see >>>> in the console logs that the queries are run, but when I run the same >>>> query >>>> against the new searcher (using CuRL), I get a slow response time for >>>> the >>>> first run. >>>> >>>> Config: >>>> <listener event="newSearcher" class="solr.QuerySenderListener"> >>>> <arr name="queries"> <lst> <str >>>> name="fq">DataType_s:Product</str> >>>> <str name="fq">WebSections_ms:house</str> <str >>>> name="fq">{!tag=current_group}GroupIds_ms:*</str> >>>> <str name="facet">true</str> <str >>>> name="facet.field">BrandID_s</str> <str >>>> name="facet.query">Price_2_f:[* TO *]</str> <str >>>> name="facet.query">Price_3_f:[* TO *]</str> <str >>>> name="facet.query">Price_4_f:[* TO *]</str> <str >>>> name="facet.query">Price_5_f:[* TO *]</str> <str >>>> name="facet.query">Price_6_f:[* TO *]</str> <str >>>> name="facet.query">Price_7_f:[* TO *]</str> <str >>>> name="facet.query">Price_8_f:[* TO *]</str> <str >>>> name="facet.mincount">1</str> <str >>>> name="facet.method">fc</str> >>>> <str name="wt">json</str> <str name="json.nl">map</str> >>>> <str name="q">(title:* OR text:*)</str> <str >>>> name="start">0</str> <str name="rows">20</str> >>>> </lst> >>>> </arr> </listener> >>>> >>>> Console log: >>>> INFO (searcherExecutor-7-thread-1-processing-x:core1) [ x:core1] >>>> o.a.s.c.S.Request [core1] webapp=null path=null >>>> >>>> >>>> params={facet=true&facet.mincount=1&start=0&facet.query=Price_2_f:[*+TO+*]&facet.query=Price_3_f:[*+TO+*]&facet.query=Price_4_f:[*+TO+*]&facet.query=Price_5_f:[*+TO+*]&facet.query=Price_6_f:[*+TO+*]&facet.query=Price_7_f:[*+TO+*]&facet.query=Price_8_f:[*+TO+*]&event=newSearcher&q=(title:*+OR+text:*)&distrib=false&json.nl=map&facet.field=BrandID_s&wt=json&facet.method=fc&fq=DataType_s:Product&fq=WebSections_ms:house&fq=VisibleOnline_ms:7&fq={!tag%3Dcurrent_group}GroupIds_ms:*&rows=20} >>>> hits=2549 status=0 QTime=1263 >>>> >>>> >>>> If I run the same query after the index has registered I see a QTime of >>>> over a second, the second time I run the query I see around 80ms. This >>>> leads >>>> me to believe the warming did not occur or the query was not commited to >>>> cache on start up of the new searcher. >>>> Can someone please advise on how to use the newSearcher queries to >>>> effectively warm SolR caches. Should I see an improved response for the >>>> first time I run the query if the same query has been used as a >>>> newSearcher >>>> query? >>>> Cheers, >>>> Dalton >>> >>> >> >> > >