[ 
https://issues.apache.org/jira/browse/TEXT-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno P. Kinoshita resolved TEXT-76.
------------------------------------
    Resolution: Fixed

Fixed by removing the Math.round call and returning the unrounded Jaro-Winkler 
distance. 

Jaro-Winkler values for different pairs may differ only in the later decimal 
digits. So even if we fixed the rounding issue (e.g. by using BigDecimal and 
rounding with DOWN or FLOOR), several pairs would still return 0.99, whereas the 
original values would tell you which pairs are closer to each other.
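
To illustrate the point, here is a minimal sketch. The class name, the helper 
names and the three raw scores are hypothetical (the scores are not computed 
from real strings), and the first helper is only assumed to resemble the 
rounding step that was removed.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingSketch {
    // Assumed shape of the removed step: round to two decimal places.
    static double roundToTwoPlaces(double score) {
        return Math.round(score * 100.0) / 100.0;
    }

    // The alternative mentioned above: truncate to two decimal places.
    static double truncateToTwoPlaces(double score) {
        return BigDecimal.valueOf(score)
                .setScale(2, RoundingMode.DOWN)
                .doubleValue();
    }

    public static void main(String[] args) {
        // Hypothetical raw scores for three different, non-equal pairs.
        double[] scores = {0.9912, 0.9951, 0.9987};
        for (double s : scores) {
            System.out.println("raw=" + s
                    + " rounded=" + roundToTwoPlaces(s)
                    + " truncated=" + truncateToTwoPlaces(s));
        }
    }
}
```

With Math.round, two of the three non-equal pairs come back as 1.0. With 
truncation, all three collapse to 0.99, so their ordering is lost. Only the 
raw values keep the pairs distinguishable.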

So now we return the original value, as other libraries do (e.g. Python 
Jellyfish, java-string-similarity).

Cheers
Bruno

> Jaro Winkler implementation introduced in 3.5 is not correct
> ------------------------------------------------------------
>
>                 Key: TEXT-76
>                 URL: https://issues.apache.org/jira/browse/TEXT-76
>             Project: Commons Text
>          Issue Type: Bug
>    Affects Versions: 1.0
>            Reporter: Luc Boutier
>            Assignee: Bruno P. Kinoshita
>
> Using commons-lang 3.5, the following call returns a distance of 1:
> StringUtils.getJaroWinklerDistance("/opt/software1", "/opt/software2")
> Jaro Winkler says that a distance of 1 means equal strings, which is not the 
> case here.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
