Eli Bendersky <eli...@gmail.com> added the comment:

The new "junk heuristic" was added to difflib.py in SVN revision 26661 in
2002 (which is, incidentally, the last revision to modify difflib.py). Its
commit log says:

---------------------------------------------
Mostly in SequenceMatcher.{__chain_b, find_longest_match}:
This now does a dynamic analysis of which elements are so frequently
repeated as to constitute noise.  The primary benefit is an enormous
speedup in find_longest_match, as the innermost loop can have factors
of 100s less potential matches to worry about, in cases where the
sequences have many duplicate elements.  In effect, this zooms in on
sequences of non-ubiquitous elements now.

While I like what I've seen of the effects so far, I still consider
this experimental.  Please give it a try!
---------------------------------------------
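
To make the idea in that log concrete, here is a rough sketch of a
"popularity" filter of the kind it describes. It is not the code from
difflib.py: the helper names are hypothetical, and the cutoffs (sequences
of at least 200 elements, elements making up more than 1% of the sequence)
are assumptions chosen for illustration.

```python
from collections import Counter


def popular_elements(b, min_len=200, percent=1):
    """Return the elements of b repeated often enough to count as noise.

    Hypothetical helper; min_len and percent are illustrative assumptions.
    """
    n = len(b)
    if n < min_len:
        # Short sequences are left alone: nothing is treated as noise.
        return set()
    counts = Counter(b)
    # An element is "popular" if it makes up more than `percent` % of b.
    return {elt for elt, c in counts.items() if c * 100 > n * percent}


def index_without_noise(b):
    """Map each non-popular element of b to the list of its positions."""
    popular = popular_elements(b)
    b2j = {}
    for j, elt in enumerate(b):
        if elt not in popular:
            b2j.setdefault(elt, []).append(j)
    return b2j, popular


# 50 blank lines mixed with 200 distinct lines: only the blank line is noise.
b = [""] * 50 + ["line %d" % i for i in range(200)]
b2j, popular = index_without_noise(b)
```

Because the popular elements never enter the index, a matching loop that
walks the index has far fewer candidate positions to consider, which is the
speedup the log refers to. The flip side is that a sequence made up mostly
of repeated elements loses most of its index.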

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2986>
_______________________________________