Hi Blab,
I also want to return a score based on the Levenshtein distance from a fuzzy
query. Could you please elaborate on "writing a (native) script to handle the
scoring"? Did you actually write a script that calculates the
distance, or did you use some ES properties?
Thank you,
On Thursday, March 14, 2013 7:52:02 PM UTC+2, the blab wrote:
>
> Thanks for your response. By "against the min_similarity" I meant the
> minimum value for the similarity of the fuzzy terms, i.e. the
> min_similarity parameter provided in the query I posted.
>
> To clarify, there are two "scores" being calculated in the query: the
> Levenshtein distance that determines which terms to use, and the actual
> scoring of the returned results. I wanted the Levenshtein distance to be
> used to score the returned results, but I don't think this is possible.
>
> For future readers I solved this issue by creating a custom score query
> and writing a (native) script to handle the scoring.
>
> Thanks
>
> On Thursday, March 14, 2013 7:02:03 AM UTC, simonw wrote:
>>
>> Hey,
>>
>> this is not entirely true. The FuzzyQuery uses the Levenshtein distance
>> to find the terms in the index, which are subsequently used in a Boolean
>> OR query or in a ConstantScore filter, depending on the rewrite method you
>> choose. The default also just takes the top 50 terms within a certain LD
>> and then builds a query out of them. The scoring will just be the similarity
>> of your scoring model, so TF/IDF (VectorSpace) by default.
>>
>> I don't understand your last sentence, what do you mean by 'against the
>> min_similarity'?
>>
>> simon
>>
>> On Tuesday, March 12, 2013 6:45:09 PM UTC+1, the blab wrote:
>>>
>>> Hi,
>>>
>>> I have a question about scoring for fuzzy queries. If I understand
>>> correctly, fuzzy queries find any appropriate matches by calculating
>>> similarity using the Levenshtein distance, but this similarity value is not
>>> used when calculating the document's score. Instead the document's score is
>>> based on the tf/idf of the matched term. Is this correct? Is it possible to
>>> instead score based on similarity to the queried term for fuzzy queries?
>>> E.g. I have the below custom_score query. I'd like the score returned to be
>>> the similarity score used to evaluate against the min_similarity.
>>>
>>> {
>>>   "query": {
>>>     "custom_score": {
>>>       "query": {
>>>         "fuzzy": {
>>>           "firstname": {
>>>             "value": "Jack",
>>>             "min_similarity": "0.5",
>>>             "max_expansions": 1
>>>           }
>>>         }
>>>       },
>>>       "script": "_score"
>>>     }
>>>   }
>>> }
>>>
>>
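For readers following the thread, here is a minimal sketch of the kind of calculation a scoring script would have to perform: an edit distance between the queried value and a matched term, turned into a similarity between 0 and 1. It is not the script mentioned above (that one was never posted), and it is not an Elasticsearch native script, which would be written in Java. The function names `levenshtein` and `similarity` are hypothetical, and normalizing by the length of the shorter string is an assumption chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity(query: str, term: str) -> float:
    """Map the edit distance to [0, 1]; 1.0 means identical strings.

    Assumption: the distance is normalized by the shorter string's length.
    """
    shorter = min(len(query), len(term))
    if shorter == 0:
        return 1.0 if query == term else 0.0
    return max(0.0, 1.0 - levenshtein(query, term) / shorter)
```

With this normalization, a single edit against a four-letter value such as "jack" gives 0.75, which would pass a min_similarity of 0.5. Both inputs are assumed to be lowercased already, as an analyzed field would be.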