Øyvind wrote:
> Based on examples and formulas from http://en.wikipedia.org/wiki/Jaro-Winkler.
> Useful for measuring similarity between two strings. For example if
> you want to detect that the user did a typo.

Jaro-Winkler is best when dealing with names (Winkler works for the US
Census Bureau). There are pure Python and C-accelerated implementations at
http://bitpim.svn.sourceforge.net/viewvc/bitpim/trunk/bitpim/src/native/strings/

If you are concerned about typos, then taking the keyboard layout into
account will help. For example, for a user with a US keyboard, 'a' or 'd'
would be a common typo for 's', since those keys sit on either side of it.

Also consider Levenshtein distance:
http://en.wikibooks.org/wiki/Algorithm_implementation/Strings/Levenshtein_distance

Roger
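
To make the keyboard-layout suggestion concrete, here is a minimal sketch of a Levenshtein distance in which substituting a neighboring key costs less than substituting an arbitrary one. It is not code from the thread or from bitpim. The names (QWERTY_ROWS, build_neighbors, keyboard_distance), the 0.5 cost, and the simplification that only keys directly left or right in the same row of a US QWERTY layout count as neighbors are all assumptions made for illustration.

```python
# Sketch only: hypothetical names, simplified US QWERTY letter rows.
QWERTY_ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm")


def build_neighbors(rows=QWERTY_ROWS):
    """Map each key to the keys directly left and right of it in its row."""
    neighbors = {}
    for row in rows:
        for i, key in enumerate(row):
            neighbors[key] = set(row[max(i - 1, 0):i] + row[i + 1:i + 2])
    return neighbors


NEIGHBORS = build_neighbors()


def substitution_cost(a, b, near_cost=0.5):
    """Return 0 for equal keys, near_cost for neighbors, 1 otherwise."""
    if a == b:
        return 0.0
    if b in NEIGHBORS.get(a, ()):
        return near_cost  # assumed discount for a likely slip of the finger
    return 1.0


def keyboard_distance(s, t, near_cost=0.5):
    """Levenshtein distance with cheaper substitutions for neighboring keys."""
    s, t = s.lower(), t.lower()
    # previous[j] holds the distance between the processed prefix of s and t[:j].
    previous = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        current = [float(i)]
        for j, b in enumerate(t, start=1):
            current.append(min(
                previous[j] + 1.0,       # delete a from s
                current[j - 1] + 1.0,    # insert b into s
                previous[j - 1] + substitution_cost(a, b, near_cost),
            ))
        previous = current
    return previous[-1]
```

With this neighbor map, keyboard_distance("cat", "cst") comes out lower than keyboard_distance("cat", "cpt"), because 's' sits next to 'a' and 'p' does not. A fuller version would probably also treat diagonal neighbors and other layouts, which this sketch leaves out.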