On Tue, Jul 6, 2010 at 7:18 PM, Terry Reedy <tjre...@udel.edu> wrote:

> [Also posted to http://bugs.python.org/issue2986]
> A much faster way to find the first mismatch would be
>   i = 0
>   while first[i] == second[i]:
>      i+=1
> The match ratio, based on the initial matching prefix only, is spuriously
> low.
>
>
I don't have much experience with the Python sequence matcher, but many
classical edit distance and alignment algorithms benefit from stripping any
common prefix and suffix before doing the heavy lifting.  Stripping
trivially preserves the optimal result for Hamming-like distances, and the
same is easily shown for Levenshtein- and Damerau-type distances.
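
To make the idea concrete, here is a rough sketch of the stripping step.
The names strip_common_affixes and levenshtein are made up for this
example, and the distance function is the plain textbook dynamic program,
not what difflib's SequenceMatcher does.  The prefix loop is the same as
the one quoted above, with a length bound added so it cannot run off the
end when one sequence is a prefix of the other.

```python
def strip_common_affixes(first, second):
    """Return (prefix_len, suffix_len, first_core, second_core).

    Hypothetical helper for illustration; works on any two sliceable
    sequences whose items support ==.
    """
    limit = min(len(first), len(second))

    # Length of the shared prefix, bounded by the shorter sequence.
    prefix = 0
    while prefix < limit and first[prefix] == second[prefix]:
        prefix += 1

    # Length of the shared suffix, not allowed to overlap the prefix.
    suffix = 0
    while (suffix < limit - prefix
           and first[len(first) - 1 - suffix] == second[len(second) - 1 - suffix]):
        suffix += 1

    first_core = first[prefix:len(first) - suffix]
    second_core = second[prefix:len(second) - suffix]
    return prefix, suffix, first_core, second_core


def levenshtein(a, b):
    """Textbook O(len(a) * len(b)) Levenshtein distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]
```

The expensive quadratic step then only needs to run on the two cores,
e.g. levenshtein(first_core, second_core), and for Levenshtein distance
that should give the same answer as running it on the full inputs.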

-Kevin
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
