After reading this thread, I have wrapped up a different approach, probably not what you were looking for, but it is very good for what I wanted: comparing a command string typed by a user with all possible commands a program can accept, to be able to do typo-correction. The method will return a floating point number between 0.0 (no match) and len(s)+1.0 (if the strings are exactly equal). You can see the score as "the number of equal letters".
I put the module at http://starship.python.net/crew/hooft/ It is called comparestrings.py. It is a comparison algorithm as implemented by Reinhard Schneider and Chris Sander for comparison of protein sequences, but implemented to compare two ASCII strings. The comparison makes use of a similarity matrix for letters: in the protein case this is normally a chemical functionality similarity, for our case this is a matrix based on keys next to each other on a US Qwerty keyboard and on "capital letters are similar to their lowercase equivalent" The algorithm does not cut corners: it is sure to find the absolute best match between the two strings. No attempt has been made to make this efficient in time or memory. Time taken and memory used is proportional to the product of the lengths of the input strings. Use for strings longer than 25 characters is entirely for your own risk. Regards, Rob Hooft -- http://mail.python.org/mailman/listinfo/python-list