Binh, I have a question about the explanation above. You mentioned that after suggestions_shingle, "hotels in hosur" becomes "hotels", "in", "hosur", "hotels in", "hotels in hosur", "in hosur". Shouldn't "hotels", "in", and "hosur" be removed, since my min_shingle_size is 2? Or do the original tokens always stay?
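For illustration, here is a minimal Python sketch of how a shingle filter builds its output. It is a simulation, not Elasticsearch code. It assumes max_shingle_size is 3 (the thread does not state it, but the quoted output contains a three-word shingle) and models the filter's output_unigrams option, which according to the shingle token filter documentation defaults to true and keeps the original single tokens regardless of min_shingle_size. The function name `shingles` is hypothetical.

```python
def shingles(tokens, min_size=2, max_size=3, output_unigrams=True):
    """Simulate a shingle token filter over a list of tokens.

    min_size/max_size mirror min_shingle_size/max_shingle_size;
    max_size=3 is an assumption, not taken from the original mapping.
    """
    out = []
    for start in range(len(tokens)):
        # With output_unigrams enabled, the original token is kept as well.
        if output_unigrams:
            out.append(tokens[start])
        # Emit every shingle of min_size..max_size words starting here.
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                out.append(" ".join(tokens[start:start + size]))
    return out


tokens = "hotels in hosur".split()

with_unigrams = shingles(tokens)
without_unigrams = shingles(tokens, output_unigrams=False)

print(with_unigrams)
# ['hotels', 'hotels in', 'hotels in hosur', 'in', 'in hosur', 'hosur']
print(without_unigrams)
# ['hotels in', 'hotels in hosur', 'in hosur']
```

In this sketch the first result contains the same six tokens listed in the quoted explanation (in stream order), and the single words disappear only when output_unigrams is switched off.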
Also, after the edgengram-tokenizer I get "ho" several times. Will my final output contain only one "ho", or multiple? I thought that by default Elasticsearch applies the "unique" token filter after every filter; please correct me if I'm wrong. How can I use the unique token filter to remove the repeated tokens after the final processing step? I tried adding "unique" after "edgengram", but it is not working.

On Wed, Jan 29, 2014 at 11:52 PM, Binh Ly <[email protected]> wrote:
> Coder,
>
> The best way to understand what an analyzer is doing is by using the
> _analyze api. For example if you do something like this:
>
> curl -XGET 'http://localhost:9200/auto_index/_analyze?analyzer=str_search_analyzer&pretty&text=hotels%20in%20hosur'
>
> It will tell you how that text is analyzed. In your mapping, the analyzer
> does suggestions_shingle and edgengram. The suggestions_shingle does the
> shingle token filter
> (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-shingle-tokenfilter.html)
> so for example:
>
> "hotels in hosur" becomes "hotels", "in", "hosur", "hotels in",
> "hotels in hosur", "in hosur"
>
> Then your edgengram does the edge ngram token filter
> (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html)
> so for example:
>
> "hotels" becomes "ho", "hot", "hote", etc...
> "in" becomes "in"
> "hosur" becomes "ho", "hos", "hosu", etc...
> etc...
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1e8826a3-ccb9-4987-9372-88c3967b7d68%40googlegroups.com.
>
> For more options, visit https://groups.google.com/groups/opt_out.
>

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAAVTvp6HK4_iWQC_hMwKijLBGNRXUa3CcHW_YeTFSH1MYBDEyw%40mail.gmail.com. For more options, visit https://groups.google.com/groups/opt_out.
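
To make the repeated "ho" tokens from the question above concrete, here is a small Python simulation of edge n-gram generation followed by de-duplication. It is only a sketch under assumptions: min_gram=2 is inferred from the quoted example (which starts at "ho"), max_gram=20 is an arbitrary placeholder, and the helper names `edge_ngrams` and `unique` are hypothetical. The `unique` helper mimics what the Elasticsearch unique token filter is documented to do when only_on_same_position is false (drop any token already seen in the stream); it does not explain why the filter appeared not to work in the original mapping.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Return the prefixes of `token` from min_gram to max_gram characters.

    min_gram=2 and max_gram=20 are assumptions for illustration only.
    """
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def unique(tokens):
    """Keep the first occurrence of each token, preserving stream order."""
    seen = set()
    out = []
    for tok in tokens:
        if tok not in seen:
            seen.add(tok)
            out.append(tok)
    return out


tokens = ["hotels", "in", "hosur"]

# Edge n-grams of every token, concatenated in stream order.
grams = [g for tok in tokens for g in edge_ngrams(tok)]
deduped = unique(grams)

print(grams.count("ho"))    # 2: once from "hotels", once from "hosur"
print(deduped.count("ho"))  # 1
print(deduped)
# ['ho', 'hot', 'hote', 'hotel', 'hotels', 'in', 'hos', 'hosu', 'hosur']
```

To see what the real analyzer emits, the _analyze request quoted earlier in this thread remains the authoritative check.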
