Re: How to use ElasticSearch to implement Autocompleter ?

Mukul Gupta Fri, 17 Jan 2014 08:24:59 -0800

But the problem still remains. The completion suggester returns results
only when there is an exact match, but as previously mentioned, a user on a
travel website can phrase a query in many different ways.
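
To make the concern concrete, here is a minimal sketch in plain Java. It does
not call Elasticsearch; prefixMatch and sharedTokens are hypothetical helpers,
and treating the suggester as a plain prefix match is a simplifying assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative only: plain-Java stand-ins, not Elasticsearch APIs.
final class PrefixVsTokenMatch {

    // Prefix-style matching: the typed text must be the start of the suggestion.
    static boolean prefixMatch(String suggestion, String typed) {
        return suggestion.toLowerCase(Locale.ROOT)
                .startsWith(typed.trim().toLowerCase(Locale.ROOT));
    }

    // Lowercased, whitespace-separated tokens of a phrase.
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(
                text.trim().toLowerCase(Locale.ROOT).split("\\s+")));
    }

    // Token-style matching: count the distinct words both phrases share.
    static int sharedTokens(String suggestion, String typed) {
        Set<String> shared = tokens(suggestion);
        shared.retainAll(tokens(typed));
        return shared.size();
    }

    public static void main(String[] args) {
        String suggestion = "things to do in goa";
        String typed = "what are the best things to do in";
        System.out.println("prefix match:  " + prefixMatch(suggestion, typed));
        System.out.println("shared tokens: " + sharedTokens(suggestion, typed));
    }
}
```

With the rephrased input the prefix check fails, while the two phrases still
share the words "things", "to", "do" and "in".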


Thanks


On Fri, Jan 17, 2014 at 9:41 PM, joa <[email protected]> wrote:

> You should look at the completion suggester added in 0.90.3 instead
> of using edge n-grams.
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
> http://www.elasticsearch.org/blog/you-complete-me/
>
>
> On Friday, January 17, 2014 5:04:14 PM UTC+1, coder wrote:
>>
>> Hi,
>>
>> I'm trying to use Elasticsearch to implement an autocompleter for my
>> college project, just as some travel websites do, but I'm facing some
>> issues with the implementation.
>>
>> I'm using the following mapping:
>>
>> curl -XPUT 'http://localhost:9200/auto_index/' -d '{
>>      "settings" : {
>>         "index" : {
>>             "number_of_shards" : 1,
>>             "number_of_replicas" : 1,
>>             "analysis" : {
>>                "analyzer" : {
>>                   "str_search_analyzer" : {
>>                       "tokenizer" : "standard",
>>                       "filter" : ["lowercase","asciifolding","suggestions_shingle","edgengram"]
>>                    },
>>                    "str_index_analyzer" : {
>>                      "tokenizer" : "standard",
>>                      "filter" : ["lowercase","asciifolding","suggestions_shingle","edgengram"]
>>                   }
>>                },
>>                "filter" : {
>>                    "suggestions_shingle": {
>>                        "type": "shingle",
>>                        "min_shingle_size": 2,
>>                        "max_shingle_size": 5
>>                   },
>>                   "edgengram" : {
>>                       "type" : "edgeNGram",
>>                       "min_gram" : 2,
>>                       "max_gram" : 30,
>>                       "side"     : "front"
>>                   },
>>                   "mynGram" : {
>>                         "type" : "nGram",
>>                         "min_gram" : 2,
>>                         "max_gram" : 30
>>                   }
>>               }
>>           },
>>           "similarity" : {
>>                      "index": {
>>                              "type": "org.elasticsearch.index.similarity.CustomSimilarityProvider"
>>                      },
>>                      "search": {
>>                              "type": "org.elasticsearch.index.similarity.CustomSimilarityProvider"
>>                      }
>>           }
>>      }
>>   }
>> }'
>>
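
A rough sketch of what the shingle and edge n-gram filters in the settings
above emit may help when reasoning about matches. This is plain Java that only
mimics the two filters; it does not call Elasticsearch or Lucene, the class
and method names are hypothetical, and it leaves out the single-word tokens
that the shingle filter also emits by default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: mimics the shingle and edge n-gram token filters.
final class ShingleEdgeNGramSketch {

    // Word shingles of minSize..maxSize tokens, joined by a single space.
    static List<String> shingles(List<String> tokens, int minSize, int maxSize) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            for (int size = minSize;
                 size <= maxSize && start + size <= tokens.size();
                 size++) {
                out.add(String.join(" ", tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    // Front edge n-grams: prefixes of length minGram..maxGram.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int len = minGram; len <= Math.min(maxGram, token.length()); len++) {
            out.add(token.substring(0, len));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("things", "to", "do", "in", "goa");
        // Same sizes as the settings: shingles of 2..5 words, grams of 2..30 chars.
        for (String shingle : shingles(tokens, 2, 5)) {
            System.out.println(shingle + " -> " + edgeNGrams(shingle, 2, 30));
        }
    }
}
```

Every document whose text starts with "things to do in" produces the same
grams for that part of the phrase, so the text match alone may not separate
such documents from one another.
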
>> curl -XPUT 'localhost:9200/auto_index/autocomplete/_mapping' -d '{
>>     "autocomplete":{
>>        "_boost" : {
>>             "name" : "po",
>>             "null_value" : 4.0
>>        },
>>        "properties": {
>>                 "ad": {
>>                     "type": "string",
>>                     "search_analyzer" : "str_search_analyzer",
>>                     "index_analyzer" : "str_index_analyzer",
>>                     "omit_norms": "true",
>>                     "similarity": "index"
>>                 },
>>                 "category": {
>>                     "type": "string",
>>                     "include_in_all" : false
>>                 },
>>                 "cn": {
>>                     "type": "string",
>>                     "search_analyzer" : "str_search_analyzer",
>>                     "index_analyzer" : "str_index_analyzer",
>>                     "omit_norms": "true",
>>                     "similarity": "index"
>>                 },
>>                 "ctype": {
>>                     "type": "string",
>>                     "search_analyzer" : "keyword",
>>                     "index_analyzer" : "keyword",
>>                     "omit_norms": "true",
>>                     "similarity": "index"
>>                 },
>>                 "eid": {
>>                     "type": "string",
>>                     "include_in_all" : false
>>                 },
>>                 "st": {
>>                     "type": "string",
>>                     "search_analyzer" : "str_search_analyzer",
>>                     "index_analyzer" : "str_index_analyzer",
>>                     "omit_norms": "true",
>>                     "similarity": "index"
>>                 },
>>                 "co": {
>>                     "type": "string",
>>                     "include_in_all" : false
>>                 },
>>                 "st": {
>>                     "type": "string",
>>                     "search_analyzer" : "str_search_analyzer",
>>                     "index_analyzer" : "str_index_analyzer",
>>                     "omit_norms": "true",
>>                     "similarity": "index"
>>                 },
>>                 "co": {
>>                     "type": "string",
>>                     "search_analyzer" : "str_search_analyzer",
>>                     "index_analyzer" : "str_index_analyzer",
>>                     "omit_norms": "true",
>>                     "similarity": "index"
>>                 },
>>                 "po": {
>>                     "type": "double",
>>                     "boost": 4.0
>>                 },
>>                 "en":{
>>                     "type": "boolean"
>>                 },
>>                 "_oid":{
>>                     "type": "long"
>>                 },
>>                 "text": {
>>                     "type": "string",
>>                     "search_analyzer" : "str_search_analyzer",
>>                     "index_analyzer" : "str_index_analyzer",
>>                     "omit_norms": "true",
>>                     "similarity": "index"
>>                 },
>>                 "url": {
>>                     "type": "string"
>>                 }
>>          }
>>      }
>> }'
>>
>> Then, in my Java code, I'm forming the query like this:
>>
>> // Multiply the text relevance score by the popularity field "po",
>> // falling back to 1 when "po" is missing or 0.
>> String script = "_score * (doc['po'].empty ? 1 : doc['po'].value == 0.0 ? "
>>         + "1 : doc['po'].value)";
>> QueryBuilder queryBuilder = QueryBuilders.customScoreQuery(
>>         QueryBuilders.queryString(query)
>>                 .field("text", 30)
>>                 .field("ad")
>>                 .field("st")
>>                 .field("cn")
>>                 .field("co")
>>                 .defaultOperator(Operator.AND))
>>         .script(script);
>>
>>  Some explanation of fields:
>> text: contains statements like "things to do in goa"
>> ad: address
>> st: state
>> cn: city name
>> co: country
>>
>> Now, if I type "things to do in" in my autocompleter box, I'm getting
>> these results:
>>
>> things to do in rann
>> things to do in bulandshahr
>> things to do in gondai
>> things to do in rewa
>> things to do in goa
>>
>> But I want "things to do in goa" on top.
>>
>> Earlier, I thought idf in Elasticsearch was causing the problem, so I
>> overrode the default similarity and created a CustomSimilarity that sets idf
>> to 1. But that still doesn't solve my problem. Instead, it started putting
>> results like this on top:
>>
>> things to do in toronto
>>
>> I think maybe I'm doing something wrong in my index_analyzer and
>> search_analyzer. I tried other tokenizers and token filters in different
>> orders but couldn't find a solution.
>>
>> I could have implemented a simple prefix autocompleter, but then it
>> wouldn't make sense to use Elasticsearch, since matching terms in the
>> middle of a sentence gives the user more flexibility. Also, in the travel
>> industry a person can search for the same thing in different ways. Instead
>> of searching for exactly "things to do in", he or she might write "what are
>> the best things to do in" or "what are things to do", among many other
>> possibilities. A prefix autocompleter won't work effectively for that.
>> That's why I tried implementing the autocompleter with Elasticsearch, but
>> I'm not doing it the right way.
>>
>> For better results, I also introduced a popularity factor (po) that is
>> updated on every user click, so that a clicked result scores higher in
>> later searches through the custom score query. I also give the text field
>> a boost of 30 and lower weight to the other fields. But something is not
>> going right.
>>
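
The arithmetic of the popularity script can be sketched in plain Java as
follows. This is only a restatement of the script expression quoted above;
the class name and the sample numbers are made up for illustration.

```java
// Illustrative only: restates the custom score script as a Java method.
final class PopularityScore {

    // _score * (po missing or 0 ? 1 : po)
    static double boosted(double score, Double po) {
        double factor = (po == null || po == 0.0) ? 1.0 : po;
        return score * factor;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: a weaker text match with many clicks...
        double popularButWeak = boosted(1.0, 50.0);
        // ...against a stronger text match with few clicks.
        double relevantButNew = boosted(5.0, 2.0);
        System.out.println(popularButWeak + " vs " + relevantButNew);
    }
}
```

Because the popularity value multiplies the text score directly, a document
with many clicks can outrank one that matches the typed text more closely.
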
>> I guess I'm not using Elasticsearch's capabilities properly for my
>> use case. Can you please help me with this?
>>
>> Thanks
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/3fb42188-c58a-4ab0-bcb8-48c1b075eb71%40googlegroups.com
> .
>
> For more options, visit https://groups.google.com/groups/opt_out.
>
