Re: Jarow-Winkler algorithm: Measuring similarity between strings

2008-12-20 Thread Øyvind
Thanks for the useful comments. On 20 Dec, 01:38, John Machin sjmac...@lexicon.net wrote: On Dec 20, 10:02 am, Øyvind oyvin...@gmail.com wrote: Based on examples and formulas from http://en.wikipedia.org/wiki/Jaro-Winkler. For another Python implementation, google febrl. Useful for

Re: Jarow-Winkler algorithm: Measuring similarity between strings

2008-12-20 Thread John Machin
On Dec 20, 7:07 pm, Øyvind oyvin...@gmail.com wrote: Thanks for the useful comments. On 20 Dec, 01:38, John Machin sjmac...@lexicon.net wrote: On Dec 20, 10:02 am, Øyvind oyvin...@gmail.com wrote: Based on examples and formulas from http://en.wikipedia.org/wiki/Jaro-Winkler. For

Re: Jarow-Winkler algorithm: Measuring similarity between strings

2008-12-20 Thread bearophileHUGS
John Machin: This paper by Heikki Hyyrö is well worth reading, and refers to a whole lot of previous work, including Ukkonen's: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.6.2242 This is the site of the author: http://www.cs.uta.fi/~helmu/pubs/pubs.html There you can find updates

Jarow-Winkler algorithm: Measuring similarity between strings

2008-12-19 Thread Øyvind
Based on examples and formulas from http://en.wikipedia.org/wiki/Jaro-Winkler. Useful for measuring similarity between two strings, for example to detect that the user made a typo. def jarow(s1,s2): Returns a number between 1 and 0, where 1 is the most similar
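
The posted code is cut off in this listing, so here is a minimal sketch of the measure as it is usually described, not a reconstruction of the original `jarow` function. The names `jaro` and `jaro_winkler`, the prefix weight `p=0.1`, and the prefix cap of 4 characters are assumptions taken from the common textbook formulation. As in the post, the result lies between 0 and 1, where 1 is the most similar.

```python
def jaro(s1, s2):
    """Jaro similarity: 1.0 for identical strings, 0.0 for no matching characters."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are equal and no further
    # apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Walk the matched characters of both strings in order; each position
    # where they differ is half a transposition.
    k = 0
    half_transpositions = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                half_transpositions += 1
            k += 1
    t = half_transpositions // 2

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3.0


def jaro_winkler(s1, s2, p=0.1, max_prefix=4):
    """Jaro similarity boosted for a shared prefix (p and max_prefix are assumed defaults)."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1.0 - j)
```

With these definitions, `jaro_winkler("MARTHA", "MARHTA")` comes out at about 0.961 and `jaro_winkler("DIXON", "DICKSONX")` at about 0.813. The code assumes Python 3, where `/` is true division.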

Re: Jarow-Winkler algorithm: Measuring similarity between strings

2008-12-19 Thread John Machin
On Dec 20, 10:02 am, Øyvind oyvin...@gmail.com wrote: Based on examples and formulas from http://en.wikipedia.org/wiki/Jaro-Winkler. For another Python implementation, google febrl. Useful for measuring similarity between two strings, for example to detect that the user made a typo.

Re: Jarow-Winkler algorithm: Measuring similarity between strings

2008-12-19 Thread Roger Binns
Øyvind wrote: Based on examples and formulas from http://en.wikipedia.org/wiki/Jaro-Winkler. Useful for measuring similarity between two strings, for example to detect that the user made a typo. Jaro-Winkler is best when dealing with