How to retrieve unique values in typeahead

2010-03-15 Thread Nair, Manas
Hi experts,
 
Please help me out on this.
 
I have a collection of about 30K documents which pertain to pop artists (eg. 
Madonna, Michael Jackson). These artist names are indexed in the field named 
artist_t which has the following properties in dynamic field declaration:
<dynamicField name="*_t" type="text" indexed="true" stored="true"/>

Most of the documents will have MJ as their artist. I am using the EdgeNGram 
filter factory to get a typeahead implementation, i.e.

when I type in "m" I would get "madonna", "michael jackson", "miley cyrus", etc. 
as results. The problem that I have now is that all these terms are repeated.

When I search for "m", instead of "madonna", "michael jackson" I am getting 
MJ repeated many times in the initial 10 docs that Solr returns by default.

I need to make all these artists unique, i.e. if I search for "m", I should get 
each artist just once.

How should I change the schema file, and is any query tweaking required?

Any help would be dearly appreciated.

Thanks and regards,

Manas



Re: How to retrieve unique values in typeahead

2010-03-15 Thread Ahmet Arslan

> I have a collection of about 30K documents which pertain to
> pop artists (eg. Madonna, Michael Jackson). These artist
> names are indexed in the field named artist_t which has
> the following properties in dynamic field declaration:
> <dynamicField name="*_t" type="text" indexed="true"
> stored="true"/>
>
> Most of the documents will have MJ as their artist. I am
> using EdgeNGram filter factory to get a typeahead
> implementation, i.e.
>
> when I type in "m" I would get "madonna", "michael jackson",
> "miley cyrus", etc. as results. The problem that I have now is
> that all these terms are repeated.
>
> When I search for "m", instead of "madonna", "michael
> jackson" I am getting MJ repeated many times in the
> initial 10 docs that Solr returns by default.
>
> I need to make all these artists unique, i.e. if I search
> for "m", I should get each artist just once.
>
> How should I change the schema file, and is any query
> tweaking required?

The TermsComponent (http://wiki.apache.org/solr/TermsComponent), which can be 
used for auto-suggest, can eliminate repeated terms: it returns indexed terms 
rather than documents, so each term appears only once. With this solution you 
don't need EdgeNGram anymore. If you want to suggest more than one term, you 
can add ShingleFilterFactory to your index analyzer chain.
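
To make the suggestion above concrete, here is a minimal sketch of how a 
client might build a TermsComponent request for the prefix "m". The parameters 
terms, terms.fl, terms.prefix and terms.limit are the ones documented on the 
wiki page; the host, port and the /terms handler path are assumptions about 
your setup, and build_terms_url is just a hypothetical helper name. The prefix 
is lowercased on the assumption that your index analyzer lowercases terms.

```python
from urllib.parse import urlencode


def build_terms_url(base_url, field, prefix, limit=10):
    """Build a TermsComponent request URL for prefix-based suggestions.

    base_url is assumed to point at a request handler that has the
    TermsComponent enabled (e.g. ".../solr/terms"); adjust it to your setup.
    """
    params = {
        "terms": "true",                 # switch the TermsComponent on
        "terms.fl": field,               # field whose indexed terms are listed
        "terms.prefix": prefix.lower(),  # assumes terms are lowercased at index time
        "terms.limit": limit,            # maximum number of terms to return
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host and handler path, shown only for illustration.
url = build_terms_url("http://localhost:8983/solr/terms", "artist_t", "m")
print(url)
```

Note that on a tokenized field such as artist_t the terms returned are single 
tokens (e.g. "michael", not "michael jackson"), which is why the 
ShingleFilterFactory is suggested for multi-word suggestions.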