Otis,

The reason I ask is that I run a number of sites on Solr, some with 10 million+ docs faceting on similar types of data, and have not seen anywhere near this length of initial delay. The main differences are that those sites facet on single-valued fields rather than multivalued ones, and that this site is searching 3 times the volume of data. Would switching to single-valued fields (I'd rather not) make much of a difference?
I've also noticed that multivalued fields aren't populating the Lucene field cache. Is this the correct behaviour?

Regards

Howard

On 10 January 2011 14:55, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:

> Hi Howard,
>
> This is normal. Your first query is reading a bunch of index data from
> disk and your RAM is then caching it. If your first query involves
> sorting, some more data for FieldCache is being read and stored. If there
> are multiple sort fields, one such thing for each. If facets are involved,
> more of that stuff. If you are optimizing your index you are likely to be
> forcing more disk IO...
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
> ----- Original Message ----
> > From: Howard Lee <how...@workdigital.co.uk>
> > To: solr-user@lucene.apache.org
> > Sent: Mon, January 10, 2011 8:59:03 AM
> > Subject: Multivalued fields and facet performance
> >
> > Hi,
> >
> > I'd appreciate some explanation of what may be going on in the following
> > scenario using multivalued fields and facets.
> >
> > Solr version: 1.5
> >
> > Our index contains 35 million docs, and our search is using 2 multivalued
> > fields as facets. There are approx 5 million different values in one field
> > and 5000 in the other. We are seeing the following, and I'm curious as to
> > what is actually happening in the background.
> >
> > The first search can take up to 5 minutes; all subsequent queries of any q
> > return in under a second. This is fine unless you are the first search or
> > new searcher.
> >
> > I plan on adding a first searcher and new searcher in the config to avoid
> > long delays every time the index is updated (once a day), but I have
> > concerns about the length of the delay in launching a new searcher, and
> > whether this is causing too much overhead.
> >
> > Can someone explain to me what processes are going on in the background
> > that cause this behaviour, so I can understand the implications or make
> > some adjustments in the config to compensate?
> >
> > thanx
> >
> > Howard

--
WORKDIGITAL LTD
workdigital.co.uk
32-34 Broadwick Street
W1A 2HG London, UK

Howard Lee
CEO
M +44(0)7931 476 766
E how...@workdigital.co.uk

workhound.co.uk - salarytrack.co.uk - twitterjobsearch.com - dreamjobalert.co.uk - recruitmentadnetwork.com
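
As a rough illustration of the "first query is slow, later ones are fast" pattern discussed in this thread, here is a toy Python model. It is not Solr's or Lucene's implementation: the class `ToySearcher`, its methods, and the sample data are all invented for illustration. The one assumption it encodes is that faceting on a multivalued field needs a per-document view of the terms, that this view is built the first time a searcher is asked for facets, and that it is then reused for as long as that searcher lives.

```python
from collections import Counter


class ToySearcher:
    """Hypothetical stand-in for one index searcher (not a Solr class)."""

    def __init__(self, postings):
        # postings: term -> list of doc ids containing it (an inverted index)
        self.postings = postings
        self._doc_terms = None  # per-document view, built lazily
        self.builds = 0         # how many times the expensive build ran

    def _uninvert(self):
        # Expensive step: walk every term's postings once to get doc -> terms.
        # The cost grows with the number of distinct terms and postings.
        if self._doc_terms is None:
            self.builds += 1
            doc_terms = {}
            for term, docs in self.postings.items():
                for doc in docs:
                    doc_terms.setdefault(doc, []).append(term)
            self._doc_terms = doc_terms
        return self._doc_terms

    def facet_counts(self, matching_docs):
        # Cheap step once the per-document view exists: count terms
        # over just the documents that matched the query.
        doc_terms = self._uninvert()
        counts = Counter()
        for doc in matching_docs:
            counts.update(doc_terms.get(doc, ()))
        return counts


# Made-up sample data: three terms spread over four documents.
postings = {"java": [0, 1, 3], "python": [1, 2], "london": [0, 2, 3]}
searcher = ToySearcher(postings)

first = searcher.facet_counts([0, 1])   # triggers the one-off build
second = searcher.facet_counts([2, 3])  # reuses the cached structure
```

In this model a fresh `ToySearcher` (the analogue of a new searcher after an index update) starts with nothing cached, so warming it amounts to issuing one facet request before it takes real traffic.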