You can index a term in multiple ways with the completion suggester. See
http://www.elasticsearch.org/blog/you-complete-me/, which uses hotel
bookings as its example use case:
curl -X PUT localhost:9200/hotels/hotel/1 -d '
{
"name" : "Mercure Hotel Munich",
"city" : "Munich",
"name_suggest" : {
"input" : [
"Mercure Hotel Munich",
"Mercure Munich",
"ADD OTHER WORD COMBINATIONS HERE..."
]
}
}'
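If you don't want to list the combinations by hand, you could generate them. Here is a minimal Java sketch of one simple strategy (my assumption, not something taken from the blog post): emit every suffix of the name that starts at a word boundary, so that a user who starts typing at a later word still hits one of the inputs. The names SuggestInputs and wordSuffixes are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds candidate "input" values for a completion field.
final class SuggestInputs {

    /** Returns every suffix of the phrase that starts at a word boundary. */
    static List<String> wordSuffixes(String phrase) {
        List<String> inputs = new ArrayList<>();
        String trimmed = phrase.trim();
        if (trimmed.isEmpty()) {
            return inputs; // nothing to suggest for blank names
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            // Join words[start..end) back into a single input string.
            inputs.add(String.join(" ", Arrays.copyOfRange(words, start, words.length)));
        }
        return inputs;
    }

    public static void main(String[] args) {
        // prints [Mercure Hotel Munich, Hotel Munich, Munich]
        System.out.println(wordSuffixes("Mercure Hotel Munich"));
    }
}
```

The resulting list would go into the "input" array of name_suggest, next to any hand-picked variants such as "Mercure Munich".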
If by "exact matches" you mean that you also want fuzzy suggestions (e.g.
suggestions despite a misspelling), you can set the fuzzy parameter:
curl -X POST 'localhost:9200/music/_suggest?pretty' -d '{
"song-suggest" : {
"text" : "n",
"completion" : {
"field" : "suggest",
"fuzzy" : {
"edit_distance" : 2
}
}
}
}'
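To get a feel for what an edit distance of 2 allows, here is a small Java sketch of the plain Levenshtein distance (insertions, deletions, substitutions). It is only an illustration of the concept: the suggester's own fuzzy matching is implemented differently and may count some edits, such as transpositions, in another way. The class name EditDistance is made up.

```java
// Illustration only: classic two-row dynamic-programming Levenshtein distance.
final class EditDistance {

    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                        prev[j - 1] + cost);                    // substitute or keep
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("mercure", "mercur")); // prints 1
        System.out.println(levenshtein("munich", "munihc"));  // prints 2
    }
}
```

Under this definition, both misspellings above are within a distance of 2 of the correct word.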
On Friday, January 17, 2014 5:20:36 PM UTC+1, coder wrote:
>
> But the problem remains. The completion suggester returns results only
> when there is an exact match, and as I mentioned earlier, a user on a
> travel website can phrase a query in many different ways.
>
> Thanks
>
>
> On Fri, Jan 17, 2014 at 9:41 PM, joa <[email protected]> wrote:
>
>> You should look at the completion suggester added in 0.90.3 instead
>> of using edgengrams.
>>
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
>> http://www.elasticsearch.org/blog/you-complete-me/
>>
>>
>> On Friday, January 17, 2014 5:04:14 PM UTC+1, coder wrote:
>>>
>>> Hi,
>>>
>>> I'm trying to use Elasticsearch to implement an autocompleter for my
>>> college project, similar to the ones some travel websites use, but I'm
>>> facing some issues with the implementation.
>>>
>>> I'm using the following settings and mapping:
>>>
>>> curl -XPUT 'http://localhost:9200/auto_index/' -d '{
>>> "settings" : {
>>> "index" : {
>>> "number_of_shards" : 1,
>>> "number_of_replicas" : 1,
>>> "analysis" : {
>>> "analyzer" : {
>>> "str_search_analyzer" : {
>>> "tokenizer" : "standard",
>>> "filter" : ["lowercase","asciifolding","suggestions_shingle","edgengram"]
>>> },
>>> "str_index_analyzer" : {
>>> "tokenizer" : "standard",
>>> "filter" : ["lowercase","asciifolding","suggestions_shingle","edgengram"]
>>> }
>>> },
>>> "filter" : {
>>> "suggestions_shingle": {
>>> "type": "shingle",
>>> "min_shingle_size": 2,
>>> "max_shingle_size": 5
>>> },
>>> "edgengram" : {
>>> "type" : "edgeNGram",
>>> "min_gram" : 2,
>>> "max_gram" : 30,
>>> "side" : "front"
>>> },
>>> "mynGram" : {
>>> "type" : "nGram",
>>> "min_gram" : 2,
>>> "max_gram" : 30
>>> }
>>> }
>>> },
>>> "similarity" : {
>>> "index": {
>>> "type": "org.elasticsearch.index.similarity.CustomSimilarityProvider"
>>> },
>>> "search": {
>>> "type": "org.elasticsearch.index.similarity.CustomSimilarityProvider"
>>> }
>>> }
>>> }
>>> }
>>> }'
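To see what this analysis chain roughly emits, here is a simplified Java sketch of shingles (2 to 5 words, plus the single words, which the shingle filter also emits by default) followed by front edge n-grams (2 to 30 characters). It is an approximation under stated assumptions: it splits on whitespace instead of using the standard tokenizer, skips asciifolding, and ignores token positions. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Rough approximation of: lowercase -> shingle(2..5) -> edgeNGram(2..30, front).
final class ShingleEdgeNGramSketch {

    /** Single words plus word n-grams of minSize..maxSize words. */
    static List<String> shingles(String[] words, int minSize, int maxSize) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            out.add(words[start]); // unigram
            StringBuilder sb = new StringBuilder(words[start]);
            for (int size = 2; size <= maxSize && start + size <= words.length; size++) {
                sb.append(' ').append(words[start + size - 1]);
                if (size >= minSize) {
                    out.add(sb.toString());
                }
            }
        }
        return out;
    }

    /** Prefixes of the token with minGram..maxGram characters. */
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int len = minGram; len <= Math.min(maxGram, token.length()); len++) {
            out.add(token.substring(0, len));
        }
        return out;
    }

    /** Distinct terms produced for a piece of text. */
    static Set<String> analyze(String text) {
        String[] words = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        Set<String> terms = new LinkedHashSet<>();
        for (String shingle : shingles(words, 2, 5)) {
            terms.addAll(edgeNGrams(shingle, 2, 30));
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> query = analyze("things to do in");
        // Both documents contain every term the query produces.
        System.out.println(analyze("things to do in goa").containsAll(query));  // prints true
        System.out.println(analyze("things to do in rewa").containsAll(query)); // prints true
    }
}
```

In this simplified model every "things to do in X" document matches all of the query's terms, so the text match alone cannot decide which one comes first. The order then depends on other factors such as the popularity value.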
>>>
>>> curl -XPUT 'localhost:9200/auto_index/autocomplete/_mapping' -d '{
>>> "autocomplete":{
>>> "_boost" : {
>>> "name" : "po",
>>> "null_value" : 4.0
>>> },
>>> "properties": {
>>> "ad": {
>>> "type": "string",
>>> "search_analyzer" : "str_search_analyzer",
>>> "index_analyzer" : "str_index_analyzer",
>>> "omit_norms": "true",
>>> "similarity": "index"
>>> },
>>> "category": {
>>> "type": "string",
>>> "include_in_all" : false
>>> },
>>> "cn": {
>>> "type": "string",
>>> "search_analyzer" : "str_search_analyzer",
>>> "index_analyzer" : "str_index_analyzer",
>>> "omit_norms": "true",
>>> "similarity": "index"
>>> },
>>> "ctype": {
>>> "type": "string",
>>> "search_analyzer" : "keyword",
>>> "index_analyzer" : "keyword",
>>> "omit_norms": "true",
>>> "similarity": "index"
>>> },
>>> "eid": {
>>> "type": "string",
>>> "include_in_all" : false
>>> },
>>> "st": {
>>> "type": "string",
>>> "search_analyzer" : "str_search_analyzer",
>>> "index_analyzer" : "str_index_analyzer",
>>> "omit_norms": "true",
>>> "similarity": "index"
>>> },
>>> "co": {
>>> "type": "string",
>>> "search_analyzer" : "str_search_analyzer",
>>> "index_analyzer" : "str_index_analyzer",
>>> "omit_norms": "true",
>>> "similarity": "index"
>>> },
>>> "po": {
>>> "type": "double",
>>> "boost": 4.0
>>> },
>>> "en":{
>>> "type": "boolean"
>>> },
>>> "_oid":{
>>> "type": "long"
>>> },
>>> "text": {
>>> "type": "string",
>>> "search_analyzer" : "str_search_analyzer",
>>> "index_analyzer" : "str_index_analyzer",
>>> "omit_norms": "true",
>>> "similarity": "index"
>>> },
>>> "url": {
>>> "type": "string"
>>> }
>>> }
>>> }
>>> }'
>>>
>>> Then, in my Java code, I build the query like this:
>>>
>>> String script = "_score * (doc['po'].empty ? 1 : doc['po'].value == 0.0"
>>>         + " ? 1 : doc['po'].value)";
>>> QueryBuilder queryBuilder = QueryBuilders.customScoreQuery(
>>> QueryBuilders.queryString(query)
>>> .field("text",30)
>>> .field("ad")
>>> .field("st")
>>> .field("cn")
>>> .field("co")
>>>
>>> .defaultOperator(Operator.AND)).script(script);
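The arithmetic of that script can be checked outside Elasticsearch. Below is a minimal Java sketch of the same formula, assuming that a null value stands for a document with no "po" value. The class and method names are made up, and the sample numbers are invented for illustration.

```java
// Mirrors: _score * (po is empty ? 1 : po == 0.0 ? 1 : po)
final class PopularityScore {

    /** po == null models a document that has no value in the "po" field. */
    static double boosted(double score, Double po) {
        double factor = (po == null || po == 0.0) ? 1.0 : po;
        return score * factor;
    }

    public static void main(String[] args) {
        System.out.println(boosted(2.0, null)); // prints 2.0
        System.out.println(boosted(2.0, 0.0));  // prints 2.0
        System.out.println(boosted(2.0, 5.0));  // prints 10.0
        System.out.println(boosted(2.0, 0.5));  // prints 1.0
    }
}
```

Because the popularity is multiplied in directly, a document with a large "po" can outrank a better text match, and a "po" between 0 and 1 lowers the score below that of a document with no popularity at all.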
>>>
>>> Some explanation of fields:
>>> text: contains statements like "things to do in goa"
>>> ad: address
>>> st: state
>>> cn: city name
>>> co: country
>>>
>>> Now, if I type "things to do in" into my autocomplete box, I get
>>> these results:
>>>
>>> things to do in rann
>>> things to do in bulandshahr
>>> things to do in gondai
>>> things to do in rewa
>>> things to do in goa
>>>
>>> But I want "things to do in goa" on top.
>>>
>>> At first I thought IDF in Elasticsearch was causing the problem, so I
>>> overrode the default similarity with a CustomSimilarity that sets IDF
>>> to 1. That still didn't solve my problem. Instead, I started getting
>>> results like this:
>>>
>>> "things to do in toronto" on top.
>>>
>>> I think I may be doing something wrong in my index_analyzer and
>>> search_analyzer. I tried other tokenizers and token filters in different
>>> orders but could not find a solution.
>>>
>>> I could have implemented a simple prefix autocompleter, but then there
>>> would be little point in using Elasticsearch, since matching terms in the
>>> middle of a sentence gives the user more flexibility. Also, in the travel
>>> industry a person can search for the same thing in different ways. For
>>> example, instead of typing exactly "things to do in", he or she might
>>> write "what are the best things to do in" or "what are things to do",
>>> among many other possibilities. A prefix autocompleter won't work well
>>> for that. That's why I tried implementing the autocompleter with
>>> Elasticsearch, but I'm not doing it the right way.
>>>
>>> For better results, I also introduced a popularity factor ("po") that is
>>> updated on every user click, so that a document's score rises in later
>>> searches through the custom score query. I also give the text field a
>>> boost of 30 and leave the other fields at the default weight. But
>>> something is still not right.
>>>
>>> I guess I'm not using Elasticsearch's capabilities properly for my
>>> use case. Can you please help me with this?
>>>
>>> Thanks
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/3fb42188-c58a-4ab0-bcb8-48c1b075eb71%40googlegroups.com
>> .
>>
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>