[jira] [Updated] (TEXT-130) JaroWinklerDistance: Wrong results due to precision of transpositions

2018-10-18 Thread Pascal Schumacher (JIRA)


 [ https://issues.apache.org/jira/browse/TEXT-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher updated TEXT-130:
---
Affects Version/s: 1.4

> JaroWinklerDistance: Wrong results due to precision of transpositions
> -
>
> Key: TEXT-130
> URL: https://issues.apache.org/jira/browse/TEXT-130
> Project: Commons Text
> Issue Type: Bug
> Affects Versions: 1.4
> Reporter: Jan Martin Keil
> Assignee: Rob Tompkins
> Priority: Major
> Fix For: 1.5
>
>
> The method {{JaroWinklerDistance#matches}} returns {{transpositions / 2}} as an
> integer. However, {{transpositions}} is not guaranteed to be even. E.g.,
> comparing "aaabcd" and "aaacdb" results in {{transpositions}} = 3.
> The method must therefore return 1.5, not 1; otherwise the similarity is
> 0.9611 instead of 0.9417.
> I recommend returning the undivided count as {{halfTranspositions}} instead
> of {{transpositions}} and doing the cast and division
> ({{(double) mtp[1] / 2}}) in {{JaroWinklerDistance#apply}}.
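
The two similarity values quoted above can be reproduced with a minimal,
self-contained sketch. It is not the Commons Text source: the class name
JaroWinklerSketch, the truncate flag, the fixed scaling factor of 0.1 and the
prefix cap of 4 are assumptions made for illustration. Only the
(double) mtp[1] / 2 expression is taken from the report.

```java
import java.util.Locale;

/** Illustrative sketch only; names and structure are hypothetical. */
final class JaroWinklerSketch {

    /** Returns {matches, halfTranspositions, commonPrefixLength}. */
    static int[] matches(CharSequence a, CharSequence b) {
        int range = Math.max(Math.max(a.length(), b.length()) / 2 - 1, 0);
        boolean[] aMatched = new boolean[a.length()];
        boolean[] bMatched = new boolean[b.length()];
        int matches = 0;
        for (int i = 0; i < a.length(); i++) {
            int lo = Math.max(i - range, 0);
            int hi = Math.min(i + range + 1, b.length());
            for (int k = lo; k < hi; k++) {
                if (!bMatched[k] && a.charAt(i) == b.charAt(k)) {
                    aMatched[i] = true;
                    bMatched[k] = true;
                    matches++;
                    break;
                }
            }
        }
        // Walk both matched sequences in order and count disagreements.
        // This count is NOT halved here, so an odd value is preserved.
        int halfTranspositions = 0;
        int j = 0;
        for (int i = 0; i < a.length(); i++) {
            if (!aMatched[i]) {
                continue;
            }
            while (!bMatched[j]) {
                j++;
            }
            if (a.charAt(i) != b.charAt(j)) {
                halfTranspositions++;
            }
            j++;
        }
        // Common prefix, capped at 4 (assumption: usual Winkler definition).
        int limit = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        return new int[] {matches, halfTranspositions, prefix};
    }

    /** truncate = true mimics the reported integer division. */
    static double similarity(CharSequence a, CharSequence b, boolean truncate) {
        int[] mtp = matches(a, b);
        double m = mtp[0];
        if (m == 0) {
            return 0d;
        }
        // Integer division drops the .5 when the count is odd.
        double t = truncate ? mtp[1] / 2 : (double) mtp[1] / 2;
        double jaro = (m / a.length() + m / b.length() + (m - t) / m) / 3;
        return jaro < 0.7 ? jaro : jaro + 0.1 * mtp[2] * (1 - jaro);
    }

    public static void main(String[] args) {
        double truncated = similarity("aaabcd", "aaacdb", true);
        double exact = similarity("aaabcd", "aaacdb", false);
        System.out.println(String.format(Locale.ROOT, "%.4f", truncated)); // prints 0.9611
        System.out.println(String.format(Locale.ROOT, "%.4f", exact));     // prints 0.9417
    }
}
```

For "aaabcd" and "aaacdb" all 6 characters match and 3 matched positions
disagree, so the Jaro term is (1 + 1 + 4.5/6) / 3 = 0.9167 with 1.5
transpositions and (1 + 1 + 5/6) / 3 = 0.9444 with the truncated value 1. The
common prefix "aaa" then lifts these to 0.9417 and 0.9611 respectively.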



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (TEXT-130) JaroWinklerDistance: Wrong results due to precision of transpositions

2018-08-08 Thread Rob Tompkins (JIRA)


 [ https://issues.apache.org/jira/browse/TEXT-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Tompkins updated TEXT-130:
--
Assignee: Rob Tompkins

> JaroWinklerDistance: Wrong results due to precision of transpositions
> -
>
> Key: TEXT-130
> URL: https://issues.apache.org/jira/browse/TEXT-130
> Project: Commons Text
> Issue Type: Bug
> Reporter: Jan Martin Keil
> Assignee: Rob Tompkins
> Priority: Major
>
> The method {{JaroWinklerDistance#matches}} returns {{transpositions / 2}} as an
> integer. However, {{transpositions}} is not guaranteed to be even. E.g.,
> comparing "aaabcd" and "aaacdb" results in {{transpositions}} = 3.
> The method must therefore return 1.5, not 1; otherwise the similarity is
> 0.9611 instead of 0.9417.
> I recommend returning the undivided count as {{halfTranspositions}} instead
> of {{transpositions}} and doing the cast and division
> ({{(double) mtp[1] / 2}}) in {{JaroWinklerDistance#apply}}.


