Jan Martin Keil created TEXT-130:
------------------------------------
Summary: JaroWinklerDistance: Wrong results due to precision of
transpositions
Key: TEXT-130
URL: https://issues.apache.org/jira/browse/TEXT-130
Project: Commons Text
Issue Type: Bug
Reporter: Jan Martin Keil
The method {{JaroWinklerDistance#matches}} returns {{transpositions / 2}} as an
integer. However, {{transpositions}} is not guaranteed to be even. For example,
comparing "aaabcd" and "aaacdb" yields {{transpositions}} = 3, so the method
should return 1.5, but integer division truncates it to 1. As a result, the
computed similarity is 0.9611111111111111 instead of the correct
0.9416666666666667.
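
The arithmetic behind those two values can be checked with a small sketch. It is not the library code: the class and method names are hypothetical, and it assumes the textbook Jaro-Winkler formula with a scaling factor of 0.1. The counts (6 matches, 3 mismatched positions, common prefix "aaa") are filled in by hand for the example strings.

```java
final class TranspositionPrecisionDemo {
    // Textbook Jaro-Winkler from precomputed counts (scaling factor 0.1 is an assumption).
    // m = matching characters, t = transpositions, prefix = common prefix length.
    static double jaroWinkler(int m, double t, int prefix, int len1, int len2) {
        double j = ((double) m / len1 + (double) m / len2 + (m - t) / m) / 3;
        return j + 0.1 * prefix * (1 - j);
    }

    public static void main(String[] args) {
        int m = 6;                  // every character of "aaabcd" finds a partner in "aaacdb"
        int halfTranspositions = 3; // "bcd" vs "cdb": three matched positions differ
        int prefix = 3;             // common prefix "aaa"

        // Integer division truncates 3 / 2 to 1: roughly 0.9611
        System.out.println(jaroWinkler(m, halfTranspositions / 2, prefix, 6, 6));
        // Casting first keeps 1.5: roughly 0.9417
        System.out.println(jaroWinkler(m, (double) halfTranspositions / 2, prefix, 6, 6));
    }
}
```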
I recommend returning {{halfTranspositions}} instead of {{transpositions}} and
doing the cast and division ({{(double) mtp[1] / 2}}) in
{{JaroWinklerDistance#apply}}.
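
A minimal sketch of that suggestion follows. It is a self-contained approximation, not the actual Commons Text source: the class name is hypothetical, and the prefix cap of 4, the 0.7 threshold, and the 0.1 scaling factor are assumed from the usual Jaro-Winkler definition. Only {{halfTranspositions}} and {{mtp[1]}} come from the proposal above.

```java
import java.util.Arrays;

final class JaroWinklerSketch {
    // Returns {matches, halfTranspositions, prefix}. The count is NOT halved here.
    static int[] matches(CharSequence first, CharSequence second) {
        CharSequence max = first.length() > second.length() ? first : second;
        CharSequence min = max == first ? second : first;
        int range = Math.max(max.length() / 2 - 1, 0);
        int[] matchIndexes = new int[min.length()];
        Arrays.fill(matchIndexes, -1);
        boolean[] matchFlags = new boolean[max.length()];
        int matches = 0;
        for (int mi = 0; mi < min.length(); mi++) {
            char c = min.charAt(mi);
            int end = Math.min(mi + range + 1, max.length());
            for (int xi = Math.max(mi - range, 0); xi < end; xi++) {
                if (!matchFlags[xi] && c == max.charAt(xi)) {
                    matchIndexes[mi] = xi;
                    matchFlags[xi] = true;
                    matches++;
                    break;
                }
            }
        }
        // Collect the matched characters of each string in their original order.
        char[] ms1 = new char[matches];
        char[] ms2 = new char[matches];
        for (int i = 0, si = 0; i < min.length(); i++) {
            if (matchIndexes[i] != -1) {
                ms1[si++] = min.charAt(i);
            }
        }
        for (int i = 0, si = 0; i < max.length(); i++) {
            if (matchFlags[i]) {
                ms2[si++] = max.charAt(i);
            }
        }
        // Each mismatched position counts as half a transposition.
        int halfTranspositions = 0;
        for (int mi = 0; mi < matches; mi++) {
            if (ms1[mi] != ms2[mi]) {
                halfTranspositions++;
            }
        }
        int prefix = 0;
        for (int mi = 0; mi < Math.min(4, min.length()); mi++) { // cap of 4 is an assumption
            if (first.charAt(mi) != second.charAt(mi)) {
                break;
            }
            prefix++;
        }
        return new int[] {matches, halfTranspositions, prefix};
    }

    static double apply(CharSequence left, CharSequence right) {
        int[] mtp = matches(left, right);
        double m = mtp[0];
        if (m == 0) {
            return 0d;
        }
        double t = (double) mtp[1] / 2; // cast before dividing, so 3 becomes 1.5
        double j = (m / left.length() + m / right.length() + (m - t) / m) / 3;
        return j < 0.7 ? j : j + 0.1 * mtp[2] * (1d - j); // threshold and factor assumed
    }
}
```

With this arrangement, {{JaroWinklerSketch.apply("aaabcd", "aaacdb")}} evaluates to roughly 0.9417, because the odd count of 3 reaches the formula as 1.5 instead of being truncated to 1.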