You should look at the completion suggester, added in 0.90.3, instead of using edge n-grams.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
http://www.elasticsearch.org/blog/you-complete-me/
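As a rough sketch of what that could look like on your index (the field name `suggest`, the document id, the `simple` analyzer, and the input/weight values are assumptions for illustration, not taken from your data), you would add a `completion` field, index each phrase with its alternative inputs and a weight, and then query the `_suggest` endpoint:

```shell
# 1. Add a completion field to the existing type.
#    "suggest" is a hypothetical field name.
curl -XPUT 'localhost:9200/auto_index/autocomplete/_mapping' -d '{
  "autocomplete": {
    "properties": {
      "suggest": {
        "type": "completion",
        "index_analyzer": "simple",
        "search_analyzer": "simple",
        "payloads": true
      }
    }
  }
}'

# 2. Index a document. Every string in "input" is a phrase that should
#    complete to "output". "weight" has to be an integer, so a popularity
#    value such as po would need rounding or scaling first (assumed here).
curl -XPUT 'localhost:9200/auto_index/autocomplete/1?refresh=true' -d '{
  "text": "things to do in goa",
  "suggest": {
    "input": ["things to do in goa", "what are the best things to do in goa", "goa"],
    "output": "things to do in goa",
    "weight": 40
  }
}'

# 3. Ask for completions of what the user has typed so far.
curl -XPOST 'localhost:9200/auto_index/_suggest?pretty' -d '{
  "autocomplete-suggest": {
    "text": "things to do in",
    "completion": {
      "field": "suggest"
    }
  }
}'
```

The suggester matches on prefixes of the inputs, so the different ways of phrasing a search have to be listed in `input`. Results come back ordered by `weight`, which is where your popularity factor would go.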
On Friday, January 17, 2014 5:04:14 PM UTC+1, coder wrote:
> Hi,
>
> I'm trying to use Elasticsearch to implement an autocompleter for my
> college project, just like some travel websites do, but I'm facing some
> issues with the implementation.
>
> I'm using the following mapping for my case:
>
> curl -XPUT 'http://localhost:9200/auto_index/' -d '{
>   "settings" : {
>     "index" : {
>       "number_of_shards" : 1,
>       "number_of_replicas" : 1,
>       "analysis" : {
>         "analyzer" : {
>           "str_search_analyzer" : {
>             "tokenizer" : "standard",
>             "filter" : ["lowercase","asciifolding","suggestion_shingle","edgengram"]
>           },
>           "str_index_analyzer" : {
>             "tokenizer" : "standard",
>             "filter" : ["lowercase","asciifolding","suggestions_shingle","edgengram"]
>           }
>         },
>         "filter" : {
>           "suggestions_shingle": {
>             "type": "shingle",
>             "min_shingle_size": 2,
>             "max_shingle_size": 5
>           },
>           "edgengram" : {
>             "type" : "edgeNGram",
>             "min_gram" : 2,
>             "max_gram" : 30,
>             "side" : "front"
>           },
>           "mynGram" : {
>             "type" : "nGram",
>             "min_gram" : 2,
>             "max_gram" : 30
>           }
>         }
>       },
>       "similarity" : {
>         "index": {
>           "type": "org.elasticsearch.index.similarity.CustomSimilarityProvider"
>         },
>         "search": {
>           "type": "org.elasticsearch.index.similarity.CustomSimilarityProvider"
>         }
>       }
>     }
>   }
> }'
>
> curl -XPUT 'localhost:9200/auto_index/autocomplete/_mapping' -d '{
>   "autocomplete":{
>     "_boost" : {
>       "name" : "po",
>       "null_value" : 4.0
>     },
>     "properties": {
>       "ad": {
>         "type": "string",
>         "search_analyzer" : "str_search_analyzer",
>         "index_analyzer" : "str_index_analyzer",
>         "omit_norms": "true",
>         "similarity": "index"
>       },
>       "category": {
>         "type": "string",
>         "include_in_all" : false
>       },
>       "cn": {
>         "type": "string",
>         "search_analyzer" : "str_search_analyzer",
>         "index_analyzer" : "str_index_analyzer",
>         "omit_norms": "true",
>         "similarity": "index"
>       },
>       "ctype": {
>         "type": "string",
>         "search_analyzer" : "keyword",
>         "index_analyzer" : "keyword",
>         "omit_norms": "true",
>         "similarity": "index"
>       },
>       "eid": {
>         "type": "string",
>         "include_in_all" : false
>       },
>       "st": {
>         "type": "string",
>         "search_analyzer" : "str_search_analyzer",
>         "index_analyzer" : "str_index_analyzer",
>         "omit_norms": "true",
>         "similarity": "index"
>       },
>       "co": {
>         "type": "string",
>         "include_in_all" : false
>       },
>       "st": {
>         "type": "string",
>         "search_analyzer" : "str_search_analyzer",
>         "index_analyzer" : "str_index_analyzer",
>         "omit_norms": "true",
>         "similarity": "index"
>       },
>       "co": {
>         "type": "string",
>         "search_analyzer" : "str_search_analyzer",
>         "index_analyzer" : "str_index_analyzer",
>         "omit_norms": "true",
>         "similarity": "index"
>       },
>       "po": {
>         "type": "double",
>         "boost": 4.0
>       },
>       "en":{
>         "type": "boolean"
>       },
>       "_oid":{
>         "type": "long"
>       },
>       "text": {
>         "type": "string",
>         "search_analyzer" : "str_search_analyzer",
>         "index_analyzer" : "str_index_analyzer",
>         "omit_norms": "true",
>         "similarity": "index"
>       },
>       "url": {
>         "type": "string"
>       }
>     }
>   }
> }'
>
> and then in my Java code, I'm forming the query like this:
>
> String script = "_score * (doc['po'].empty ? 1 : doc['po'].value == 0.0 ? 1 : doc['po'].value)";
> QueryBuilder queryBuilder = QueryBuilders.customScoreQuery(
>         QueryBuilders.queryString(query)
>                 .field("text",30)
>                 .field("ad")
>                 .field("st")
>                 .field("cn")
>                 .field("co")
>                 .defaultOperator(Operator.AND)).script(script);
>
> Some explanation of the fields:
> text: contains statements like "things to do in goa"
> ad: address
> st: state
> cn: city name
> co: country
>
> Now, if I type "things to do in" in my autocompleter box, I'm getting
> these results:
>
> things to do in rann
> things to do in bulandshahr
> things to do in gondai
> things to do in rewa
> things to do in goa
>
> But I want "things to do in goa" on top.
>
> Earlier, I thought idf in Elasticsearch was creating the problem, so I
> overrode the default similarity and created a CustomSimilarity which sets
> idf to 1. But it's still not solving my problem. Instead it started giving
> me results like this:
>
> things to do in toronto on top.
>
> I think maybe I'm doing something wrong in my index_analyzer and
> search_analyzer. I tried other tokenizers and token filters in different
> orders but was not able to find a solution.
>
> I could have implemented a simple prefix autocompleter, but then it
> wouldn't make any sense to use Elasticsearch, since searching for terms in
> the middle of sentences gives the user more flexibility. Also, in the
> travel industry a person can search for a particular thing in different
> ways. For example, instead of searching for exactly "things to do in",
> he/she can also write "what are the best things to do in" or "what are
> things to do", and there are many other possibilities. A prefix
> autocompleter won't work effectively for that. That's why I tried
> implementing the autocompleter using Elasticsearch, but I'm not doing it
> the right way.
>
> For better results, I also introduced a popularity factor which is updated
> on every user click, so that a document's score keeps increasing in every
> search using the custom score query. I'm also giving the text field a
> weight of 30 and a lower weight to the other fields. But something is not
> going right.
>
> I guess I'm not able to use Elasticsearch's capabilities properly for my
> use case. Can you please help me with this?
>
> Thanks
