Hi, 

Sorry, I am relatively new to Elasticsearch, so please don't be too 
harsh.

I feel like I can't understand the behaviour of any of the fuzzy queries 
in ES.

*1) match with fuzziness enabled*

{
  "query": {
    "match": {
      "field_name": {
        "query": "car renting London",
        "fuzziness": "0.5"
      }
    }
  }
}

From my tests, this kind of query gives the same score to documents with 
field_name="car renting London", "car ranting London" or "car renting 
Londen", for example. That means it does not penalize misspellings at all. 
I imagine that the possible variants are computed first and the score is 
then computed with a "representative score" which is the same for every 
variant that matches the requirements. 

Am I right? If so, is there any way to boost the exact match over the fuzzy 
match?
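
To make the question concrete, here is a rough Python sketch of the kind of 
query I have in mind. It assumes that wrapping the fuzzy match in a bool 
query and adding a non-fuzzy match as a "should" clause would lift exact 
matches above fuzzy ones. The helper name exact_plus_fuzzy and the boost 
value 2.0 are made up for illustration, and I have not verified that this 
is the recommended approach.

```python
import json


def exact_plus_fuzzy(field, text, fuzziness="0.5", exact_boost=2.0):
    """Build a search body where the fuzzy match is required and a
    non-fuzzy match on the same text only adds to the score.

    The helper name and the default boost are illustrative assumptions.
    """
    return {
        "query": {
            "bool": {
                # Every hit must satisfy the fuzzy match.
                "must": [
                    {"match": {field: {"query": text, "fuzziness": fuzziness}}}
                ],
                # Documents that also match without fuzziness get extra score.
                "should": [
                    {"match": {field: {"query": text, "boost": exact_boost}}}
                ],
            }
        }
    }


exact_body = exact_plus_fuzzy("field_name", "car renting London")
print(json.dumps(exact_body, indent=2))
```

The printed JSON would be sent as the body of a normal search request.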

I also get results with extra terms that receive the same score, like 
"cheap car renting London" or "offers car renting London". That's something 
I cannot understand. When I use the explain API, the resulting score seems 
to be a sum over the different term matches with their internal weightings 
(tf-idf, etc.), but it does not seem to consider the terms outside the 
query, while I would expect the exact match to score at least slightly 
higher. 

Am I missing something here? Or is this just the expected result and I am 
being too demanding?
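
In case someone wants to reproduce the score breakdown I am describing, 
this is roughly how I ask for it. It is a minimal sketch: the helper name 
with_explain is hypothetical, and it only sets the "explain" flag in the 
search body so that each hit comes back with its scoring explanation.

```python
import json


def with_explain(search_body):
    """Return a copy of a search body with per-hit score explanations on.

    with_explain is a hypothetical helper; "explain" is the search
    request option that adds an explanation to every hit.
    """
    body = dict(search_body)  # shallow copy, the caller's dict is untouched
    body["explain"] = True
    return body


plain_body = {
    "query": {
        "match": {
            "field_name": {"query": "car renting London", "fuzziness": "0.5"}
        }
    }
}
explain_body = with_explain(plain_body)
print(json.dumps(explain_body, indent=2))
```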

*2) fuzzy query*

That doesn't do what I want, since it does not analyze the query (I think), 
so it would treat the query in an unexpected way for my purposes of 
"free text" search.
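
To show what I mean: if the fuzzy query really does not analyze its input, 
I would have to split the text myself and send one fuzzy clause per term, 
something like the sketch below. The lowercase-and-split step is only a 
crude stand-in for a real analyzer, and the helper name fuzzy_per_term is 
made up for this example.

```python
import json


def fuzzy_per_term(field, text, fuzziness="0.5"):
    """Build a bool query with one fuzzy clause per whitespace-separated term.

    Lowercasing and splitting on whitespace is an assumption standing in
    for whatever analyzer the field actually uses.
    """
    terms = text.lower().split()
    return {
        "query": {
            "bool": {
                "should": [
                    {"fuzzy": {field: {"value": term, "fuzziness": fuzziness}}}
                    for term in terms
                ]
            }
        }
    }


fuzzy_body = fuzzy_per_term("field_name", "car renting London")
print(json.dumps(fuzzy_body, indent=2))
```

That is a lot of manual work compared with a query that analyzes the text 
for me, which is why this option does not look right for my case.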

*3) fuzzy_like_this or fuzzy_like_this_field*

This other query gets rid of the first problem in point 1 since, as I 
read in the documentation, it seems to use some tricks to avoid favouring 
rare terms (misspellings fall into this group) over more frequent terms, 
etc. But it still gives the same score to the exact match and to matches 
where other terms are present. 

Is there any way to get the expected behaviour? By this I mean being able 
to execute almost free-text queries with some fuzziness to get rid of 
possible misspellings in the query terms, but with an (at least for me) 
more exhaustive score computation. If not, is there any other more complex 
query or a function_score that would achieve this?

Thank you very much, any comment will be much appreciated. Also, if 
I am wrong in my assumptions, any clarification will be very welcome.
