Hi,
could somebody help, please? Even answers to parts of my question
would be useful.
I'd expect that somebody at least knows the limitations of faceting
with UnInvertedField.
Thank you,
Andreas
Andreas Hubold wrote on 04.04.2013 13:30:
Hi,
we've successfully implemented search term suggestions using facet
prefixing (facet.prefix) with Solr 4.0. However, on fields with many
unique indexed terms we've run into performance problems (long-running
queries) and even exceptions: "Too many values for UnInvertedField
faceting on field textbody".
We must provide suggestions based on a prefix entered by the user. The
solution should use the terms from an indexed text field. Furthermore,
the suggestions must be filtered according to some specified filter
queries.
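
To make the setup concrete, a facet-prefix request of this kind could be
built roughly as follows. This is only a sketch: the host, the core name
"collection1" and the filter query are placeholders, and the only name
taken from our setup is the field textbody from the exception message.
It also assumes the prefix has already been normalized (e.g. lowercased)
the same way the indexed terms are.

```python
from urllib.parse import urlencode


def facet_prefix_params(prefix, filter_queries, field="textbody", limit=10):
    """Build query parameters for a facet-prefix suggestion request."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),             # only facet counts are needed, no documents
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),  # assumed to be normalized like the indexed terms
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),   # skip terms that do not occur in the filtered set
    ]
    # Each filter query restricts the documents the facet counts are computed on.
    params += [("fq", fq) for fq in filter_queries]
    return params


def suggestion_url(base_url, prefix, filter_queries):
    """Return the full select URL for the given (placeholder) core URL."""
    return base_url + "/select?" + urlencode(facet_prefix_params(prefix, filter_queries))


if __name__ == "__main__":
    # Placeholder host, core and filter query, for illustration only.
    print(suggestion_url("http://localhost:8983/solr/collection1", "sol", ["type:article"]))
```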
Do you have any performance tips for facet prefixing or know how to
avoid the above exception even in the case of many unique terms?
What causes the above exception: a) the total number of unique terms
in the field, or b) the number of unique terms in the field of a
single document?
If b), is there an easy way to find such documents? And how many
unique terms can facet prefixing handle without problems?
I've read the blog post
http://www.searchworkings.org/blog/-/blogs/different-ways-to-make-auto-suggestions-with-solr
which describes NGrams as another possible approach to implement
suggestions with filtering. I would expect that this approach provides
better query performance (at the cost of increased index size).
However, I haven't found detailed information on how to implement it.
I know how to configure a field for ngrams and how to query that
field, but the results only give me the matching documents, not the
matched terms. Or am I expected to use a stored field and inspect
its value?
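
What I mean by inspecting the stored value is something like the
following client-side step. It is a rough sketch under my own
assumptions, not something taken from the blog post: the helper name is
made up, and the simple \w+ tokenization will not necessarily match what
the Solr analyzer does with the same text.

```python
import re


def terms_with_prefix(stored_text, prefix):
    """Return the distinct lowercased words in stored_text that start with prefix.

    Hypothetical helper: tokenization here is a plain regex split and is
    only an approximation of the field's real analysis chain.
    """
    prefix = prefix.lower()
    found = []
    for token in re.findall(r"\w+", stored_text.lower()):
        if token.startswith(prefix) and token not in found:
            found.append(token)  # keep first-seen order, drop duplicates
    return found


if __name__ == "__main__":
    print(terms_with_prefix("Solr solves search. SOLR rocks", "sol"))
```

This would have to run over the stored text of every returned document,
which is part of why I am unsure whether it is the intended approach.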
I also found this blog post where the Highlighter is used in
combination with ngrams to provide suggestions:
http://solr.pl/en/2013/02/25/autocomplete-on-multivalued-fields-using-highlighting/
Can this be used to get the suggested terms from a document? What
about performance? Will such an approach perform better than facet
prefixing for large text fields with lots of unique terms?
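
As far as I understand the highlighting idea, the client would request
highlighting on the ngram field and then pull the marked terms out of
the returned snippets, roughly as sketched below. The field name
textbody_ngram is hypothetical, and whether the highlighted span is the
whole word or only the matched gram probably depends on the analysis
chain, so please read this as an assumption on my part.

```python
import re


def highlight_params(field, pre="<em>", post="</em>"):
    """Standard Solr highlighting parameters for one field."""
    return {
        "hl": "true",
        "hl.fl": field,
        "hl.simple.pre": pre,
        "hl.simple.post": post,
    }


def highlighted_terms(response, field, pre="<em>", post="</em>"):
    """Collect the distinct highlighted spans for field from a parsed JSON response."""
    pattern = re.compile(re.escape(pre) + r"(.*?)" + re.escape(post))
    terms = []
    # The "highlighting" section maps document id -> field name -> list of snippets.
    for per_doc in response.get("highlighting", {}).values():
        for snippet in per_doc.get(field, []):
            for term in pattern.findall(snippet):
                if term not in terms:
                    terms.append(term)
    return terms


if __name__ == "__main__":
    # Hand-written sample response, for illustration only.
    sample = {
        "highlighting": {
            "doc1": {"textbody_ngram": ["about <em>Solr</em> and <em>solutions</em>"]},
            "doc2": {"textbody_ngram": ["<em>Solr</em> again"]},
        }
    }
    print(highlighted_terms(sample, "textbody_ngram"))
```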
Any hints appreciated.
Thank you,
Andreas