[
https://issues.apache.org/jira/browse/TEXT-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16567425#comment-16567425
]
Jan Martin Keil commented on TEXT-104:
--------------------------------------
This was discussed earlier in LANG-1269. "Jaro Winkler Similarity" is also
used in the literature (e.g.
[1|http://www.semantic-web-journal.net/system/files/swj1128.pdf]). Winkler
himself calls it the "Jaro-Winkler comparator" (see
[2|http://www.census.gov/srd/papers/pdf/rrs2006-02.pdf]). However,
{{Comparator}} should not be used that way in a Java context, where it already
denotes an ordering ({{java.util.Comparator}}). Why not just name
it {{JaroWinklerSimilarity}} or {{JaroWinklerSimilarityScore}}? A distance is
expected to be minimal for equal strings. A similarity is expected to be
maximal for equal strings. Even if it is not a mathematical metric, I do not
see a reason to confuse users.
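
To make the distinction concrete, here is a minimal, self-contained sketch. It is not the Commons Text implementation: the class name {{JaroWinklerSketch}} and both method names are hypothetical, and the scaling factor 0.1 and the prefix cap of 4 are the commonly cited defaults, assumed here. It shows a similarity that is maximal (1.0) for equal strings and a derived distance that is minimal (0.0) for equal strings.

```java
// Hypothetical illustration only; not the Commons Text code.
final class JaroWinklerSketch {

    /** Jaro-Winkler similarity in [0, 1]; 1.0 for equal strings. */
    static double similarity(String a, String b) {
        if (a.equals(b)) {
            return 1.0;
        }
        int la = a.length();
        int lb = b.length();
        if (la == 0 || lb == 0) {
            return 0.0;
        }
        // Characters match only if they lie within this window of each other.
        int window = Math.max(0, Math.max(la, lb) / 2 - 1);
        boolean[] matchedA = new boolean[la];
        boolean[] matchedB = new boolean[lb];
        int matches = 0;
        for (int i = 0; i < la; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(lb - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matchedB[j] && a.charAt(i) == b.charAt(j)) {
                    matchedA[i] = true;
                    matchedB[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }
        // Count matched characters that appear in a different order.
        int outOfOrder = 0;
        int k = 0;
        for (int i = 0; i < la; i++) {
            if (!matchedA[i]) {
                continue;
            }
            while (!matchedB[k]) {
                k++;
            }
            if (a.charAt(i) != b.charAt(k)) {
                outOfOrder++;
            }
            k++;
        }
        double m = matches;
        double jaro = (m / la + m / lb + (m - outOfOrder / 2.0) / m) / 3.0;
        // Winkler boost for a common prefix of up to 4 characters (assumed defaults).
        int prefix = 0;
        int maxPrefix = Math.min(4, Math.min(la, lb));
        while (prefix < maxPrefix && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }

    /** Distance derived from the similarity; 0.0 for equal strings. */
    static double distance(String a, String b) {
        return 1.0 - similarity(a, b);
    }

    public static void main(String[] args) {
        System.out.println(similarity("MARTHA", "MARTHA")); // equal strings: maximal similarity
        System.out.println(distance("MARTHA", "MARTHA"));   // equal strings: minimal distance
        System.out.println(similarity("MARTHA", "MARHTA"));
    }
}
```

With the two names kept apart like this, a caller can tell from the name alone whether a larger value means "more alike" or "less alike".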
> Jaro Winkler Distance refers to similarity
> ------------------------------------------
>
> Key: TEXT-104
> URL: https://issues.apache.org/jira/browse/TEXT-104
> Project: Commons Text
> Issue Type: Improvement
> Affects Versions: 1.1
> Reporter: Nikos Karagiannakis
> Priority: Trivial
> Fix For: 2.0
>
>
> The 'apply' method returns the similarity score instead of the distance score
> implied by the class name.
> This is stated in the Javadoc, but it is not consistent with the approach of
> the other scores in the same package (e.g. LevenshteinDetailedDistance).
> Maybe rename the class or the method to avoid confusion?
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)