2014/1/18 Oliver Heger <oliver.he...@oliver-heger.de>

> On 18.01.2014 17:40, Emmanuel Bourg wrote:
> > On 18/01/2014 16:04, Benedikt Ritter wrote:
> >
> >> About putting this into codec: I still don't think this is a good fit for
> >> this contribution. Codec is about, well, decoding and encoding stuff. Jaro
> >> Winkler and Levenshtein Distance are more like scores or metrics that help
> >> in comparing strings.
> >
> > The point is, string metrics and the soundex algorithm are often used to
> > find similarities between words. It's a bit odd to have them in
> > separate packages. That being said, string metrics don't look like a
> > good fit for codec since they don't encode anything.
>
> From a logic PoV I agree with Emmanuel that a separate Text component
> would make sense. It could also contain other stuff like special search
> algorithms or trie implementations.
>
> From an organizational PoV I also understand Gary: It is unlikely that
> we have the energy and manpower to keep such a new component alive -
> unless someone steps up now?
>
> So I am on the fence. In the past we have always tried to keep [lang] very
> focused and lean.
>
Well, these string distance metrics could be seen as an addition to
java.lang.String. In this regard a StringDistanceMetrics class would fit
into [lang].

> Oliver
>
> > Emmanuel Bourg
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org

--
http://people.apache.org/~britter/
http://www.systemoutprintln.de/
http://twitter.com/BenediktRitter
http://github.com/britter