RE: Facet performance

Toke Eskildsen Fri, 18 Oct 2013 07:24:36 -0700

Lemke, Michael  SZ/HZA-ZSW [lemke...@schaeffler.com] wrote:
> 1. 
> q=word&facet.field=CONTENT&facet=true&facet.prefix=&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0
> 2. 
> q=word&facet.field=CONTENT&facet=true&facet.prefix=a&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0


> The only difference is an empty facet.prefix in the first query.

> The first query returns after some 20 seconds (QTime 20000 in the result)
> while the second one takes only 80 msec (QTime 80). Why is this?

If your index was just opened when you issued your queries, the first request 
will be notably slower than the second, as the facet values might not be in the 
disk cache.

Furthermore, for enum the difference between no prefix and some prefix is huge. 
As enum iterates values first (as opposed to fc, which iterates hits first), 
limiting to only the values that start with 'a' ought to speed up retrieval by 
a factor of 10 or more.
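
To illustrate the difference in iteration order, here is a rough Python sketch 
over a toy in-memory index. It is not Solr code: DOCS, build_postings, 
facet_enum and facet_fc are hypothetical names made up for this example, and 
the real structures are considerably more involved. The point is only that the 
enum-style counter does one intersection per visited value, so a prefix cuts 
the work roughly in proportion to the values it excludes, while the fc-style 
counter does work proportional to the number of hits.

```python
from collections import Counter

# Toy data (hypothetical): doc id -> values of the faceted field.
DOCS = {
    0: ["apple", "banana"],
    1: ["avocado", "banana"],
    2: ["apple", "cherry"],
    3: ["banana", "cherry"],
}


def build_postings(docs):
    """Invert the toy data: value -> set of doc ids containing it."""
    postings = {}
    for doc_id, values in docs.items():
        for value in values:
            postings.setdefault(value, set()).add(doc_id)
    return postings


def facet_enum(postings, hits, prefix=""):
    """Values first: intersect each matching value's doc set with the hits.

    Returns the counts and the number of values that were intersected.
    """
    counts = {}
    visited = 0
    for value in sorted(postings):
        if not value.startswith(prefix):
            continue  # a real term dictionary could seek past these instead
        visited += 1
        n = len(postings[value] & hits)
        if n:  # corresponds to facet.mincount=1
            counts[value] = n
    return counts, visited


def facet_fc(docs, hits, prefix=""):
    """Hits first: walk the matching docs and count the values they hold."""
    counts = Counter()
    for doc_id in hits:
        for value in docs[doc_id]:
            if value.startswith(prefix):
                counts[value] += 1
    return dict(counts)


if __name__ == "__main__":
    postings = build_postings(DOCS)
    hits = {0, 1, 2}  # pretend these docs matched q=word
    print(facet_enum(postings, hits))       # all 4 values are intersected
    print(facet_enum(postings, hits, "a"))  # only the 2 'a' values are
    print(facet_fc(DOCS, hits, "a"))
```

With millions of unique values in the field, the number of intersections is 
what dominates for the enum-style approach, which is why an empty prefix hurts 
so much.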

> And as a side note: facet.method=fc makes the queries run 'forever' and
> eventually fail with org.apache.solr.common.SolrException: Too many values
> for UnInvertedField faceting on field CONTENT.

An internal memory structure optimization in Solr limits the number of unique 
values that fc can handle. It is not a bug as such, but more a consequence of 
a design choice. Unfortunately, the enum solution is normally quite slow when 
there are enough unique values to trigger the "too many values" exception. I 
know too little about the structures for DocValues to say if they will help 
here, but you might want to take a look at those.

- Toke Eskildsen
