Right, if you facet results, then your warmup queries should include those facets. The same with sorting. If you sort on fields A and B, then include warmup queries that sort on A and B.
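As a rough sketch of what that could look like in solrconfig.xml (assuming the standard solr.QuerySenderListener; A and B are placeholder field names standing in for whatever you actually sort on, and "format" is just one of the facet fields from your config below):

```xml
<!-- Hypothetical warmup query: A and B are placeholders, not real fields. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="rows">10</str>
      <!-- Sorting here loads the sort fields before real users hit them -->
      <str name="sort">A asc, B desc</str>
      <!-- Faceting here does the same for the facet field -->
      <str name="facet">true</str>
      <str name="facet.field">format</str>
    </lst>
  </arr>
</listener>
```

The same block would normally be repeated with event="firstSearcher" so that the first searcher opened after a restart is warmed too.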
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Demian Katz <demian.k...@villanova.edu>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Fri, June 3, 2011 11:21:52 AM
> Subject: RE: Solr performance tuning - disk i/o?
>
> Thanks to you and Otis for the suggestions! Some more information:
>
> - Based on the Solr stats page, my caches seem to be working pretty well
>   (few or no evictions, hit rates in the 75-80% range).
> - VuFind is actually doing two Solr queries per search (one initial search
>   followed by a supplemental spell check search -- I believe this is
>   necessary because VuFind has two separate spelling indexes, one for
>   shingled terms and one for single words). That is probably exaggerating
>   the problem, though based on searches with debugQuery on, it looks like
>   it's always the initial search (rather than the supplemental spelling
>   search) that's consuming the bulk of the time.
> - enableLazyFieldLoading is set to true.
> - I'm retrieving 20 documents per page.
> - My JVM settings: -server -Xloggc:/usr/local/vufind/solr/jetty/logs/gc.log
>   -Xms4096m -Xmx4096m -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:NewRatio=5
>
> It appears that a large portion of my problem had to do with autowarming, a
> topic that I've never had a strong grasp on, though perhaps I'm finally
> learning (any recommended primer links would be welcome!). I did have some
> autowarming settings in solrconfig.xml (an arbitrary search for a bunch of
> random keywords in the newSearcher and firstSearcher events, plus
> autowarmCount settings on all of my caches).
> However, when I looked at the debugQuery output, I noticed that a huge
> amount of time was being wasted loading facets on the first search after
> restarting Solr, so I changed my newSearcher and firstSearcher events to
> this:
>
>   <arr name="queries">
>     <lst>
>       <str name="q">*:*</str>
>       <str name="start">0</str>
>       <str name="rows">10</str>
>       <str name="facet">true</str>
>       <str name="facet.mincount">1</str>
>       <str name="facet.field">collection</str>
>       <str name="facet.field">format</str>
>       <str name="facet.field">publishDate</str>
>       <str name="facet.field">callnumber-first</str>
>       <str name="facet.field">topic_facet</str>
>       <str name="facet.field">authorStr</str>
>       <str name="facet.field">language</str>
>       <str name="facet.field">genre_facet</str>
>       <str name="facet.field">era_facet</str>
>       <str name="facet.field">geographic_facet</str>
>     </lst>
>   </arr>
>
> Overall performance has now increased dramatically, and now the biggest
> bottleneck in the debug output seems to be the shingle spell checking!
>
> Any other suggestions are welcome, since I suspect there's still room to
> squeeze more performance out of the system, and I'm still not sure I'm
> making the most of autowarming... but this seems like a big step in the
> right direction. Thanks again for the help!
>
> - Demian
>
> > -----Original Message-----
> > From: Erick Erickson [mailto:erickerick...@gmail.com]
> > Sent: Friday, June 03, 2011 9:41 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr performance tuning - disk i/o?
> >
> > This doesn't seem right. Here's a couple of things to try:
> > 1> attach &debugQuery=on to your long-running queries. The QTime
> >    returned is the time taken to search, NOT including the time to load
> >    the docs. That'll help pinpoint whether the problem is the search
> >    itself, or assembling the documents.
> > 2> Are you autowarming? If so, be sure it's actually done before
> >    querying.
> > 3> Measure queries after the first few, particularly if you're sorting
> >    or faceting.
> > 4> What are your JVM settings? How much memory do you have?
> > 5> Is <enableLazyFieldLoading> set to true in your solrconfig.xml?
> > 6> How many docs are you returning?
> >
> > There's more, but that'll do for a start.... Let us know if you gather
> > more data and it's still slow.
> >
> > Best
> > Erick
> >
> > On Fri, Jun 3, 2011 at 8:44 AM, Demian Katz <demian.k...@villanova.edu>
> > wrote:
> > > Hello,
> > >
> > > I'm trying to move a VuFind installation from an ailing physical
> > > server into a virtualized environment, and I'm running into
> > > performance problems. VuFind is a Solr 1.4.1-based application with
> > > fairly large and complex records (many stored fields, many words per
> > > record). My particular installation contains about a million records
> > > in the index, with a total index size around 6GB.
> > >
> > > The virtual environment has more RAM and better CPUs than the old
> > > physical box, and I am satisfied that my Java environment is
> > > well-tuned. My index is optimized. Searches that hit the cache
> > > respond very well. The problem is that non-cached searches are very
> > > slow - the more keywords I add, the slower they get, to the point of
> > > taking 6-12 seconds to come back with results on a quiet box and well
> > > over a minute under stress testing. (The old box still took a while
> > > for equivalent searches, but it was about twice as fast as the new
> > > one).
> > >
> > > My gut feeling is that disk access reading the index is the
> > > bottleneck here, but I know little about the specifics of Solr's
> > > internals, so it's entirely possible that my gut is wrong. Outside
> > > testing does show that the virtual environment's disk performance is
> > > not as good as the old physical server, especially when multiple
> > > processes are trying to access the same file simultaneously.
> > >
> > > So, two basic questions:
> > >
> > > 1.) Would you agree that I'm dealing with a disk bottleneck, or are
> > >     there some other factors I should be considering? Any good
> > >     diagnostics I should be looking at?
> > >
> > > 2.) If the problem is disk access, is there anything I can tune on
> > >     the Solr side to alleviate the problems?
> > >
> > > Thanks,
> > > Demian