Patches item #1678339, was opened at 2007-03-11 18:11 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1678339&group_id=5470
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Tests Group: Python 2.6 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Denys Rtveliashvili (rtvd) Assigned to: Nobody/Anonymous (nobody) Summary: Adding a testcase for the bug in find_longest_match Initial Comment: The find_longest_match method in the difflib's SequenceMatcher has a bug. The bug is in turn caused by a problem with creating a b2j mapping which should contain a list of indices for each of the list elements in b. However, when the b2j mapping is being created (this is being done in __chain_b method in the SequenceMatcher) the mapping becomes broken. The cause of this is that for the frequently used elements the list of indices is removed and the element is being enlisted in the populardict mapping. The test case tries to match two strings like: abbbbbb.... and ...bbbbbbc The number of b is equal and the find_longest_match should have returned the proper amount. However, in case the number of "b"s is large enough, the method reports that the length of the longest common substring is 0. It simply can't find it. A bug was raised some time ago on this matter. It's ID is 1528074. I tried to fix this bug but as a result, the performance drops by a factor of 5-10. Perhaps someone more familiar with Python can find a good fix. For the time being I send this test case (which is broken until the bug is fixed) and I'm going to send a fix next (but the fix makes the method run quite slowly). The patch diff attached was made against the trunk version of Python. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1678339&group_id=5470 _______________________________________________ Patches mailing list Patches@python.org http://mail.python.org/mailman/listinfo/patches