RE: facet method=enum and uninvertedfield limitations

Lemke, Michael SZ/HZA-ZSW Wed, 20 Nov 2013 08:11:57 -0800

On Wednesday, November 20, 2013 7:37 AM, Dmitry Kan wrote:

Thanks for your reply.


>
>Since you are faceting on a text field (is this correct?) you deal with a
>lot of unique values in it.

Yes, this is a text field, and we experimented with reducing the index.  As
I said in my original question, the stripped-down index had 178,000 terms
and it (fc) still didn't work.  Is the number of terms the relevant quantity?

>So your best bet is enum method. 

Hm, yes, that works but I have to wait 4 minutes for the answer (with the
original data).  Not good.

>Also if you
>are on solr 4x try building doc values in the index: this suits faceting
>well.

We are on Solr 1.4, so, no.

>
>Otherwise start from your spec once again. Can you use shingles instead?

Possibly, but I don't know shingles.  Although I'd prefer to use our original
index, we are trying to build a specialized index just for this sort of
query but still don't know what to look for.

A query like

 
q=word&facet.field=CONTENT&facet=true&facet.limit=10&facet.mincount=1&facet.method=fc&facet.prefix=a&rows=0

would give me the ten most frequent CONTENT terms starting with 'a' among
the documents containing 'word'.  That's what I want.  An empty facet.prefix
should also work.
Eventually, the query will be more complex containing other fields and
filter queries but the basic function should be exactly like this.  How
can we achieve this?
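
For reference, this is roughly how such a request could be assembled from a
script.  It is only a minimal sketch in Python: the base URL and the helper
name build_facet_query are made up for illustration and are not part of our
setup.  Only the parameter names and values come from the query above.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the real host, port and core are not given here.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_facet_query(word, prefix="", method="fc", limit=10):
    """Build the URL for a prefix-restricted facet request on CONTENT.

    rows=0 suppresses the document list, so only facet counts come back.
    An empty prefix is sent as 'facet.prefix=' and matches every term.
    """
    params = [
        ("q", word),
        ("facet.field", "CONTENT"),
        ("facet", "true"),
        ("facet.limit", limit),
        ("facet.mincount", 1),
        ("facet.method", method),   # "fc" or "enum"
        ("facet.prefix", prefix),
        ("rows", 0),
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Same request with both facet methods, for timing comparisons.
    print(build_facet_query("word", prefix="a", method="fc"))
    print(build_facet_query("word", prefix="a", method="enum"))
```

Switching between fc and enum then only means changing the method argument,
which makes it easy to time both variants against the same index.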

Thanks,
Michael


>On 19 Nov 2013 17:44, "Lemke, Michael SZ/HZA-ZSW" <lemke...@schaeffler.com>
>wrote:
>
>> On Friday, November 15, 2013 11:22 AM, Lemke, Michael SZ/HZA-ZSW wrote:
>>
>> Judging from numerous replies this seems to be a tough question.
>> Nevertheless, I'd really appreciate any help as we are stuck.
>> We'd really like to know what in our index causes the facet.method=fc
>> query to fail.
>>
>> Thanks,
>> Michael
>>
>> >On Thu, November 14, 2013 7:26 PM, Yonik Seeley wrote:
>> >>On Thu, Nov 14, 2013 at 12:03 PM, Lemke, Michael  SZ/HZA-ZSW
>> >><lemke...@schaeffler.com> wrote:
>> >>> I am running into performance problems with faceted queries.
>> >>> If I do a
>> >>>
>> >>>
>> q=word&facet.field=CONTENT&facet=true&facet.limit=10&facet.mincount=1&facet.method=fc&facet.prefix=a&rows=0
>> >>>
>> >>> I am getting an exception:
>> >>> org.apache.solr.common.SolrException: Too many values for
>> UnInvertedField faceting on field CONTENT
>> >>>         at
>> org.apache.solr.request.UnInvertedField.uninvert(UnInvertedField.java:384)
>> >>>         at
>> org.apache.solr.request.UnInvertedField.<init>(UnInvertedField.java:178)
>> >>>         at
>> org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:839)
>> >>>         ...
>> >>>
>> >>> I understand it's got something to do with a 24bit limit somewhere
>> >>> in the code but I don't understand enough of it to be able to construct
>> >>> a specialized index that can be queried with facet.method=enum.
>> >>
>> >>You shouldn't need to do anything differently to try facet.method=enum
>> >>(just replace facet.method=fc with facet.method=enum)
>> >
>> >This is true and facet.method=enum does work indeed.  The problem is
>> >runtime.  In particular queries with an empty facet.prefix= run many
>> >seconds if not minutes.  I initially asked about this here:
>> >
>> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201310.mbox/%3c33ec3398272fbe47b64ee3b3e98f69a761427...@de011521.schaeffler.com%3E
>> >
>> >It was suggested that fc is much faster than enum and I'd like to
>> >test that.  We are still fairly free to design the index such that
>> >it performs well.  But to do that we need to understand what is
>> >killing it.
>> >
>> >>
>> >>You may also want to add the parameter
>> >>facet.enum.cache.minDf=100000
>> >>to lower memory usage by only using the filter cache for terms that
>> >>match more than 100K docs.
>> >
>> >That helped a little, cut down my particular test from 10 sec to 5 sec.
>> >But still too slow.  Mind you this is for an autosuggest feature.
>> >
>> >Thanks for your reply.
>> >
>> >Michael
>> >
>> >
>>
>>
