[ 
https://issues.apache.org/jira/browse/LANG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871966#comment-13871966
 ] 

Benedikt Ritter commented on LANG-944:
--------------------------------------

Hello Rekha,

I've had a brief look at your patch. One thing I don't understand is why you 
call your new method "getSimilarityScore". Maybe we should name it after the 
algorithm it implements, like "getJaroWinklerScore". Thinking about this 
further, maybe there is room for a new class called StringAlgorithms that 
serves as a host for algorithms like LevenshteinDistance and Jaro-Winkler. 
I'll think about this some more and maybe bring this up on the ML.

It would be nice to have some more JUnit tests that show that corner cases 
also work, such as passing null, passing an empty string, passing identical 
strings, etc. At the very least, all examples that you give in the JavaDoc 
(which is very good) should be included as test cases.
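
To make the corner cases concrete, here is a rough, self-contained sketch. It 
does not use the patch: JaroWinklerSketch and its score method are hypothetical 
names, and the implementation is a textbook Jaro-Winkler variant, so scores for 
particular pairs may differ from what the patch returns. The behaviour for null 
(throw IllegalArgumentException) and for empty strings (1.0 if both are empty, 
0.0 if only one is) is an assumption of this sketch, not something the patch is 
known to do.

```java
final class JaroWinklerSketch {

    private JaroWinklerSketch() {}

    /** Hypothetical helper: Jaro-Winkler similarity in [0, 1], 1.0 meaning identical. */
    static double score(String first, String second) {
        if (first == null || second == null) {
            // Assumption of this sketch: null input is rejected.
            throw new IllegalArgumentException("Strings must not be null");
        }
        if (first.equals(second)) {
            return 1.0; // identical strings, including two empty strings
        }
        if (first.isEmpty() || second.isEmpty()) {
            return 0.0; // assumption: nothing can match an empty string
        }

        // Characters match if they are equal and at most 'window' positions apart.
        int window = Math.max(0, Math.max(first.length(), second.length()) / 2 - 1);
        boolean[] firstMatched = new boolean[first.length()];
        boolean[] secondMatched = new boolean[second.length()];
        int matches = 0;
        for (int i = 0; i < first.length(); i++) {
            int from = Math.max(0, i - window);
            int to = Math.min(second.length() - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (!secondMatched[j] && first.charAt(i) == second.charAt(j)) {
                    firstMatched[i] = true;
                    secondMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }

        // Count matched characters that appear in a different order.
        int outOfOrder = 0;
        int j = 0;
        for (int i = 0; i < first.length(); i++) {
            if (!firstMatched[i]) {
                continue;
            }
            while (!secondMatched[j]) {
                j++;
            }
            if (first.charAt(i) != second.charAt(j)) {
                outOfOrder++;
            }
            j++;
        }

        double m = matches;
        double jaro = (m / first.length() + m / second.length()
                + (m - outOfOrder / 2.0) / m) / 3.0;

        // Winkler boost: reward a common prefix of up to four characters.
        int prefix = 0;
        int maxPrefix = Math.min(4, Math.min(first.length(), second.length()));
        while (prefix < maxPrefix && first.charAt(prefix) == second.charAt(prefix)) {
            prefix++;
        }
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }

    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        check(score("", "") == 1.0, "two empty strings");
        check(score("abc", "") == 0.0, "one empty string");
        check(score("abc", "abc") == 1.0, "identical strings");
        check(score("abc", "xyz") == 0.0, "nothing in common");
        check(Math.abs(score("MARTHA", "MARHTA") - 0.9611) < 1e-3, "classic example");
        try {
            score(null, "abc");
            throw new AssertionError("expected IllegalArgumentException for null");
        } catch (IllegalArgumentException expected) {
            // null input is rejected, as assumed above
        }
        System.out.println("all corner cases passed");
    }
}
```

In the actual patch, each of these checks would be a JUnit assertion against 
the new StringUtils method, with the expected values taken from its JavaDoc.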

You have tested the following:

{code:java}
assertEquals(0, new Double(0.87).compareTo(new 
Double(StringUtils.getSimilarityScore("PENNSYLVANIA", "PENCILVANYA") )) );
{code}

which does what it is intended to do, but I think it's much more readable to 
write:

{code:java}
assertEquals(0.87d, StringUtils.getSimilarityScore("PENNSYLVANIA", 
"PENCILVANYA"), 0.0d);
{code}
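
As a side note on the third argument: to my understanding, JUnit 4's 
assertEquals(double, double, double) passes when the two values compare equal 
or differ by at most the delta. The sketch below mimics that check without 
depending on JUnit. DeltaComparisonSketch and equalsWithinDelta are 
hypothetical names used only for illustration.

```java
final class DeltaComparisonSketch {

    private DeltaComparisonSketch() {}

    /** Hypothetical helper mirroring the delta check described above. */
    static boolean equalsWithinDelta(double expected, double actual, double delta) {
        return Double.compare(expected, actual) == 0
                || Math.abs(expected - actual) <= delta;
    }

    public static void main(String[] args) {
        double computed = 0.1 + 0.2; // not exactly 0.3 in double arithmetic

        // A delta of 0.0 is as strict as the compareTo-based check.
        System.out.println(equalsWithinDelta(0.3, computed, 0.0));         // false
        System.out.println(Double.valueOf(0.3).compareTo(computed) == 0);  // false

        // A small positive delta tolerates the rounding error.
        System.out.println(equalsWithinDelta(0.3, computed, 1e-9));        // true
    }
}
```

With a delta of 0.0d the suggested form is therefore just as strict as the 
original one; it is only easier to read.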

Do you have an Individual Contributor License Agreement filed with the ASF? It 
is not an absolute requirement for contributing, but if you intend to 
contribute on a regular basis, it would be good to file one. You can read about 
it at http://www.apache.org/licenses/.

Thanks for contributing!

> Add a feature of SimilarityMatch in StringUtils 
> ------------------------------------------------
>
>                 Key: LANG-944
>                 URL: https://issues.apache.org/jira/browse/LANG-944
>             Project: Commons Lang
>          Issue Type: New Feature
>          Components: lang.*
>            Reporter: Rekha Joshi
>             Fix For: 3.3, Review Patch, Discussion
>
>         Attachments: LANG-944.1.patch
>
>
> Add SimilarityMatch algorithm to evaluate a similarity matching ratio between 
> two strings.
> double matchscore = StringUtils.calculateSimilarityMatching(String s1, String 
> s2)
> I have a patch ready with implementation of similaritymatch.
> This happens to be a usual need in science algorithm and directly using 
> commons lang3 library for these string operation would be neat.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
