Right, if you facet results, then your warmup queries should include those 
facets.  The same with sorting.  If you sort on fields A and B, then include 
warmup queries that sort on A and B.
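
As a rough sketch of what that could look like in solrconfig.xml (A and B are
the placeholder sort fields from above, "format" is one of the facet fields
from the config quoted below, and the sort directions are just an assumption;
substitute whatever your real searches use):

```xml
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- Warm the sort on field A, together with a facet that real searches use -->
    <lst>
      <str name="q">*:*</str>
      <str name="sort">A asc</str>
      <str name="facet">true</str>
      <str name="facet.field">format</str>
    </lst>
    <!-- Warm the sort on field B -->
    <lst>
      <str name="q">*:*</str>
      <str name="sort">B asc</str>
    </lst>
  </arr>
</listener>
```

The same queries would normally be repeated in a firstSearcher listener so
that the first search after a restart is warmed as well.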

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Demian Katz <demian.k...@villanova.edu>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Fri, June 3, 2011 11:21:52 AM
> Subject: RE: Solr performance tuning - disk i/o?
> 
> Thanks to you and Otis for the suggestions!  Some more information:
> 
> - Based on the Solr stats page, my caches seem to be working pretty well
>   (few or no evictions, hit rates in the 75-80% range).
> - VuFind is actually doing two Solr queries per search (one initial search
>   followed by a supplemental spell check search -- I believe this is necessary
>   because VuFind has two separate spelling indexes, one for shingled terms and
>   one for single words).  That is probably exaggerating the problem, though
>   based on searches with debugQuery on, it looks like it's always the initial
>   search (rather than the supplemental spelling search) that's consuming the
>   bulk of the time.
> - enableLazyFieldLoading is set to true.
> - I'm retrieving 20 documents per page.
> - My JVM settings: -server -Xloggc:/usr/local/vufind/solr/jetty/logs/gc.log
>   -Xms4096m -Xmx4096m -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:NewRatio=5
> 
> It appears that a large portion of my problem had to do with autowarming, a
> topic that I've never had a strong grasp on, though perhaps I'm finally
> learning (any recommended primer links would be welcome!).  I did have some
> autowarming settings in solrconfig.xml (an arbitrary search for a bunch of
> random keywords in the newSearcher and firstSearcher events, plus
> autowarmCount settings on all of my caches).  However, when I looked at the
> debugQuery output, I noticed that a huge amount of time was being wasted
> loading facets on the first search after restarting Solr, so I changed my
> newSearcher and firstSearcher events to this:
> 
>       <arr name="queries">
>         <lst>
>           <str name="q">*:*</str>
>           <str name="start">0</str>
>           <str name="rows">10</str>
>           <str name="facet">true</str>
>           <str name="facet.mincount">1</str>
>           <str name="facet.field">collection</str>
>           <str name="facet.field">format</str>
>           <str name="facet.field">publishDate</str>
>           <str name="facet.field">callnumber-first</str>
>           <str name="facet.field">topic_facet</str>
>           <str name="facet.field">authorStr</str>
>           <str name="facet.field">language</str>
>           <str name="facet.field">genre_facet</str>
>           <str name="facet.field">era_facet</str>
>           <str name="facet.field">geographic_facet</str>
>         </lst>
>       </arr>
> 
> Overall performance has now increased dramatically, and now the biggest
> bottleneck in the debug output seems to be the shingle spell checking!
> 
> Any other suggestions are welcome, since I suspect there's still room to
> squeeze more performance out of the system, and I'm still not sure I'm making
> the most of autowarming...  but this seems like a big step in the right
> direction.  Thanks again for the help!
> 
> - Demian
> 
> > -----Original Message-----
> > From: Erick Erickson [mailto:erickerick...@gmail.com]
> > Sent: Friday, June 03, 2011 9:41 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr performance tuning - disk i/o?
> > 
> > This doesn't seem right. Here's a couple of things to try:
> > 1> attach &debugQuery=on to your long-running queries. The QTime returned
> >    is the time taken to search, NOT including the time to load the docs.
> >    That'll help pinpoint whether the problem is the search itself, or
> >    assembling the documents.
> > 2> Are you autowarming? If so, be sure it's actually done before querying.
> > 3> Measure queries after the first few, particularly if you're sorting or
> >    faceting.
> > 4> What are your JVM settings? How much memory do you have?
> > 5> is <enableLazyFieldLoading> set to true in your solrconfig.xml?
> > 6> How many docs are you returning?
> > 
> > 
> > There's more, but that'll do for a start.... Let us know if you gather
> > more data and it's still slow.
> > 
> > Best
> > Erick
> > 
> > On Fri, Jun 3, 2011 at 8:44 AM, Demian Katz <demian.k...@villanova.edu>
> > wrote:
> > > Hello,
> > >
> > > I'm trying to move a VuFind installation from an ailing physical
> > > server into a virtualized environment, and I'm running into performance
> > > problems.  VuFind is a Solr 1.4.1-based application with fairly large
> > > and complex records (many stored fields, many words per record).  My
> > > particular installation contains about a million records in the index,
> > > with a total index size around 6GB.
> > >
> > > The virtual environment has more RAM and better CPUs than the old
> > > physical box, and I am satisfied that my Java environment is
> > > well-tuned.  My index is optimized.  Searches that hit the cache respond
> > > very well.  The problem is that non-cached searches are very slow - the
> > > more keywords I add, the slower they get, to the point of taking 6-12
> > > seconds to come back with results on a quiet box and well over a minute
> > > under stress testing.  (The old box still took a while for equivalent
> > > searches, but it was about twice as fast as the new one).
> > >
> > > My gut feeling is that disk access reading the index is the
> > > bottleneck here, but I know little about the specifics of Solr's
> > > internals, so it's entirely possible that my gut is wrong.  Outside
> > > testing does show that the virtual environment's disk performance
> > > is not as good as the old physical server, especially when multiple
> > > processes are trying to access the same file simultaneously.
> > >
> > > So, two basic questions:
> > >
> > > 1.)    Would you agree that I'm dealing with a disk bottleneck, or
> > >        are there some other factors I should be considering?  Any good
> > >        diagnostics I should be looking at?
> > >
> > > 2.)    If the problem is disk access, is there anything I can tune on
> > >        the Solr side to alleviate the problems?
> > >
> > > Thanks,
> > > Demian
> > >
> 
