On Wed, 18 May 2005 15:48:32 -0400, William Park <[EMAIL PROTECTED]> wrote:
>How do you compare 2 strings, and determine how much they are "close" to
>each other?  Eg.
>    aqwerty
>    qwertyb
>are similar to each other, except for first/last char.  But, how do I
>quantify that?
>
>I guess you can say for the above 2 strings that
>    - at max, 6 chars out of 7 are same sequence --> 85% max
>
>But, for
>    qawerty
>    qwerbty
>max correlation is
>    - 3 chars out of 7 are the same sequence --> 42% max

1. Google for such topics as "fuzzy matching", "edit distance",
"approximate comparison".

2. Closer to home, look at the thread in comp.lang.python around
2004-11-18 -- search for "Pingel Hyyro" [and yes, you do mean "hyyro",
not "hydro"!!]

3. Steadfastly ignore any (presumably) well-intentioned profferings of
soundex.

HTH,
John
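
For anyone wanting something concrete to try before searching, here is a minimal sketch of two of the measures touched on above. It is only an illustration, not the method from the thread mentioned in point 2. The function names edit_distance and longest_common_run are made up for this example. The first is a plain dynamic-programming Levenshtein distance; the second uses the standard library's difflib.SequenceMatcher to find the longest run of characters shared by both strings, which is the "same sequence" count in the quoted question.

```python
from difflib import SequenceMatcher


def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    # prev[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


def longest_common_run(a, b):
    """Length of the longest contiguous block found in both strings."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


if __name__ == "__main__":
    for s, t in [("aqwerty", "qwertyb"), ("qawerty", "qwerbty")]:
        print(s, t, edit_distance(s, t), longest_common_run(s, t))
```

Note that both pairs in the question come out at an edit distance of 2, while their longest common runs are 6 and 3 characters. Which number counts as "close" depends on what you are matching, so it is worth trying more than one measure on your own data.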