Toke Eskildsen [mailto:t...@statsbiblioteket.dk] wrote:
>Lemke, Michael  SZ/HZA-ZSW [lemke...@schaeffler.com] wrote:
>> 1. 
>> q=word&facet.field=CONTENT&facet=true&facet.prefix=&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0
>> 2. 
>> q=word&facet.field=CONTENT&facet=true&facet.prefix=a&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0
>
>> The only difference is an empty facet.prefix in the first query.
>
>> The first query returns after some 20 seconds (QTime 20000 in the result) 
>> while
>> the second one takes only 80 msec (QTime 80). Why is this?
>
>If your index was just opened when you issued your queries, the first request 
>will be notably slower than the second as the facet values might not be in 
>the disk cache.

I know, but it shouldn't be orders of magnitude as in this example, should it?
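
One way to separate the cold-cache effect from the prefix effect would be to issue both requests several times in alternating order and compare the QTime of the later runs. The sketch below only builds the two parameter strings from the queries quoted above; the helper name facet_query and the alternating schedule are illustrative assumptions, and the Solr host and core path are left out because they are not given here.

```python
from urllib.parse import urlencode

def facet_query(prefix):
    """Build the facet request parameters; only facet.prefix varies."""
    params = [
        ("q", "word"),
        ("facet.field", "CONTENT"),
        ("facet", "true"),
        ("facet.prefix", prefix),  # "" reproduces query 1, "a" query 2
        ("facet.limit", 10),
        ("facet.mincount", 1),
        ("facet.method", "enum"),
        ("rows", 0),
    ]
    return urlencode(params)

no_prefix = facet_query("")
with_prefix = facet_query("a")

# Hypothetical schedule: alternate the two requests so that neither one
# is always the first to touch a cold disk cache.
schedule = [no_prefix, with_prefix, no_prefix, with_prefix]
```

If the no-prefix request stays slow on its repeat runs, the disk cache is probably not the main cause.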

>
>Furthermore, for enum the difference between no prefix and some prefix is 
>huge. As enum iterates values first (as opposed to fc that iterates hits 
>first), limiting to only the values that start with 'a' ought to speed up 
>retrieval by a factor of 10 or more.

Thanks.  That is what we sort of figured, but it's good to know for sure.  Of 
course it raises the question of whether there is a way to speed this up.
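
To make the values-first behaviour concrete, here is a toy model of enum-style faceting. It is not Solr's code: the index contents, the function name enum_facet and the visited counter are all made up for illustration. It walks the sorted term list, intersects each term's documents with the query hits, and stops as soon as the terms no longer match the prefix.

```python
from bisect import bisect_left

# Toy inverted index (assumed data): term -> ids of documents containing it.
index = {
    "apple": [1, 2, 5],
    "avocado": [2, 3],
    "banana": [1, 4],
    "cherry": [3, 5],
    "word": [1, 2, 3, 5],
}
terms = sorted(index)

def enum_facet(hits, prefix=""):
    """Values-first faceting: visit each candidate term, count matching hits."""
    counts = {}
    visited = 0
    # Jump straight to the first term that could carry the prefix.
    start = bisect_left(terms, prefix)
    for term in terms[start:]:
        if not term.startswith(prefix):
            break  # sorted terms: nothing later can match the prefix
        visited += 1
        n = len(hits.intersection(index[term]))
        if n:  # plays the role of facet.mincount=1
            counts[term] = n
    return counts, visited

hits = set(index["word"])            # documents matching q=word
all_counts, all_visited = enum_facet(hits)
a_counts, a_visited = enum_facet(hits, "a")
```

In this toy index the empty prefix visits all 5 terms while the prefix 'a' visits only 2. With millions of unique values in a field like CONTENT, the same effect would leave far fewer intersections to compute.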

>
>> And as side note: facet.method=fc makes the queries run 'forever' and 
>> eventually
>> fail with org.apache.solr.common.SolrException: Too many values for 
>> UnInvertedField faceting on field CONTENT.
>
>An internal memory structure optimization in Solr limits the number of 
>possible unique values when using fc. It is not a bug as such, but more a 
>consequence of a choice. Unfortunately the enum-solution is normally quite 
>slow when there are enough unique values to trigger the "too many 
>values"-exception. I know too little about the structures for DocValues to say 
>if they will help here, but you might want to take a look at those.

What are DocValues?  I haven't heard of them yet.  And yes, the fc method was 
terribly slow in a case where it did work: something like 20 minutes, whereas 
enum returned within a few seconds.

Michael
