Otis,
The reason I ask is that I run a number of sites on Solr, some with 10
million+ docs faceting on similar types of data, and have not seen anywhere
near this length of initial delay. The main differences are that these sites
facet on single-valued fields rather than multivalued ones, and that this site
is searching on 3 times the volume of data. Would switching to single-valued
fields (I'd rather not) make much of a difference?

I've also noticed that multivalued fields aren't populating the Lucene field
cache. Is this the correct behaviour?

Regards

Howard

On 10 January 2011 14:55, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:

> Hi Howard,
>
> This is normal.  Your first query is reading a bunch of index data from disk
> and your RAM is then caching it.  If your first query involves sorting, some
> more data for FieldCache is being read and stored.  If there are multiple
> sort fields, one such thing for each.  If facets are involved, more of that
> stuff.  If you are optimizing your index you are likely to be forcing more
> disk IO....
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> ----- Original Message ----
> > From: Howard Lee <how...@workdigital.co.uk>
> > To: solr-user@lucene.apache.org
> > Sent: Mon, January 10, 2011 8:59:03 AM
> > Subject: Multivalued fields and facet performance
> >
> > Hi,
> >
> > I'd appreciate some explanation of what may be going on in the following
> > scenario using multivalued fields and facets.
> >
> > Solr version: 1.5
> >
> > Our index contains 35 million docs, and our search is using 2 multivalued
> > fields as facets. There are approx 5 million different values in one field
> > and 5000 in the other. We are seeing the following, and I'm curious as to
> > what is actually happening in the background.
> >
> > The first search can take up to 5 minutes; all subsequent queries of any q
> > return in under a second. This is fine unless you are the first search or
> > new searcher.
> >
> > I plan on adding a first searcher and new searcher in the config to avoid
> > long delays every time the index is updated (once a day), but I have
> > concerns about the length of the delay in launching a new searcher, and
> > whether this is causing too much overhead.
> >
> > Can someone explain to me what processes are going on in the background
> > that cause this behaviour, so I can understand the implications or make
> > some adjustments in the config to compensate?
> >
> > thanx
> >
> > Howard
> >
>
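
For reference, the first searcher / new searcher warming mentioned in the
quoted message amounts to running a representative faceted query before real
traffic arrives, so that the caches are already populated. A minimal sketch of
building such a query URL follows. The base URL and the field names facet_a
and facet_b are hypothetical placeholders, not taken from the thread; q, rows,
facet and facet.field are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_warming_url(base_url, facet_fields, query="*:*"):
    """Build a Solr select URL that facets on the given fields.

    rows=0 skips returning documents; the point of the request is only
    to make Solr compute (and cache) the facet data.
    """
    params = [("q", query), ("rows", "0"), ("facet", "true")]
    # facet.field is repeated once per field to facet on.
    params += [("facet.field", name) for name in facet_fields]
    return base_url + "/select?" + urlencode(params)


# Hypothetical host and field names, for illustration only.
url = build_warming_url("http://localhost:8983/solr", ["facet_a", "facet_b"])
print(url)
```

Sending this URL once after each daily index update would move the long first
query away from real users, though it does not shorten the warm-up itself.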



-- 
WORKDIGITAL LTD
workdigital.co.uk
32-34 Broadwick Street
W1A 2HG London, UK

Howard Lee
CEO

M  +44(0)7931 476 766
E  how...@workdigital.co.uk

workhound.co.uk - salarytrack.co.uk - twitterjobsearch.com -
dreamjobalert.co.uk - recruitmentadnetwork.com
