[issue2986] difflib.SequenceMatcher not matching long sequences

2019-11-07 Thread Roundup Robot



Change by Roundup Robot :


--
pull_requests: +16592
pull_request: https://github.com/python/cpython/pull/17082

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue2986] difflib.SequenceMatcher not matching long sequences

2011-01-08 Thread Jesús Cea Avión


Changes by Jesús Cea Avión j...@jcea.es:


--
nosy: +jcea

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2986
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue2986] difflib.SequenceMatcher not matching long sequences

2011-01-08 Thread Terry J. Reedy


Changes by Terry J. Reedy tjre...@udel.edu:


--
stage: needs patch -> committed/rejected


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-25 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

Agreed. #10534. This is really a 'follow-on' rather than 'superseder',
but the forward reference should be easy for anyone to find.

--
resolution:  -> fixed
status: open -> closed
superseder:  -> difflib.SequenceMatcher: expose junk sets, deprecate 
undocumented isb... functions.
type: feature request -> behavior
versions: +Python 2.6, Python 2.7, Python 3.1


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-24 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

Since I am not sure I will be able to do any more before the 3.2b1 feature 
freeze, I went ahead with the minimal patch after checking the differences from 
the 2.7 version and redoing the Misc/News entry.
(I suspect putting a new entry immediately after the appropriate heading, 
instead of between other headings, is probably least likely to fatally conflict 
with intervening changes.) r86745 Thank you Eli and Simon.

Leaving this open for possible further changes.

--
type: behavior -> feature request


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-24 Thread Simon Cross


Simon Cross hodges...@gmail.com added the comment:

My vote is that this bug be closed and a new feature request be opened. Failing 
that, it would be good to have a concise description of what else we would like 
done (and the priority should be downgraded, I guess).

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-24 Thread Eli Bendersky


Eli Bendersky eli...@gmail.com added the comment:

Terry, I agree with Simon re closing and opening a new feature request. This 
issue has too much baggage in it, and we can always link to it. A new feature 
request should be opened strictly for 3.2

If you want I can close this issue and open a new one, but I'm waiting for your 
approval.

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-21 Thread Eli Bendersky


Eli Bendersky eli...@gmail.com added the comment:

Simon's patch fix for 3.2 looks good to me - applies cleanly to py3k and tests 
pass.

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-20 Thread Simon Cross


Simon Cross hodges...@gmail.com added the comment:

I made the minor changes needed to get Eli Bendersky's patch to apply against 
3.2. Diff attached.

--
nosy: +hodgestar
Added file: http://bugs.python.org/file19675/issue2986.fix32.5.patch


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-20 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

Deadline is probably next Fri. However I will apply this or slight revision 
thereof in a couple of days to make sure this much is in. I have to fixup some 
work stuff today.

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-19 Thread Eli Bendersky


Eli Bendersky eli...@gmail.com added the comment:

Terry, when is the deadline for producing the patch for 3.2? Perhaps we should 
at least submit the 2.7 patch for now so that it goes in for sure?

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-12 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

r86437 - correct and replicate version-added message

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Terry J. Reedy


Changes by Terry J. Reedy tjre...@udel.edu:


--
versions:  -Python 3.1


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

Tim told me to continue with this as he has no time.
rev86401 - apply 3.1 doc fix

--
assignee: tim_one -> terry.reedy


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Eli Bendersky


Eli Bendersky eli...@gmail.com added the comment:

Attaching a new patch for 2.7 freshly generated vs. current 2.7 maintenance 
branch from SVN.

--
Added file: http://bugs.python.org/file19569/issue2986.fix27.5.patch


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

Tim told me to continue with this as he has no time.
rev86401 - apply 3.1 doc fix

I cannot apply the 2.7 patch. It has different header lines. In particular, 
TortoiseSVN cannot fetch nonexistent revision Mon Aug 30 06:37:52 2010 +0300. 
Please regenerate against current 2.7 with method used for 2.6/3.1.

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Terry J. Reedy


Changes by Terry J. Reedy tjre...@udel.edu:


--
Removed message: http://bugs.python.org/msg120925


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

issue2986.fix27.5.patch applied, with version note added to doc, as
rev86418

Only thing left is patch for 3.2, which Eli and I will produce.

--
stage: commit review -> needs patch
versions:  -Python 2.7


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-07 Thread Eli Bendersky


Eli Bendersky eli...@gmail.com added the comment:

Adding a documentation patch for 3.1 which is similar to the 2.6 documentation 
patch that's been committed by Georg into 2.6

--
Added file: http://bugs.python.org/file19538/issue2986.docs31.1.patch


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-09-07 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

The patch changes the internal function that constructs the dict mapping b 
items to indexes to read as follows:
  create b2j mapping
  if isjunk function, move junk items to junk set
  if autojunk, move popular items to popular set
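The three steps listed above can be sketched roughly as follows. This is an illustration, not the committed patch: the function name `chain_b`, its return value, and the use of sets are assumptions made for the sketch, and the popularity cutoff `n // 100 + 1` is one way of writing "duplicates after the first make up more than 1% of a sequence of at least 200 items".

```python
def chain_b(b, isjunk=None, autojunk=True):
    """Illustrative sketch: build the index map, then strip junk, then popular items."""
    # Step 1: map every item of b to the list of positions where it occurs.
    b2j = {}
    for i, elt in enumerate(b):
        b2j.setdefault(elt, []).append(i)

    # Step 2: if the caller supplied isjunk, move junk items to their own set.
    junk = set()
    if isjunk:
        for elt in b2j:
            if isjunk(elt):
                junk.add(elt)
        for elt in junk:
            del b2j[elt]

    # Step 3: if autojunk is on, move over-frequent items to the popular set.
    popular = set()
    n = len(b)
    if autojunk and n >= 200:
        ntest = n // 100 + 1
        for elt, idxs in b2j.items():
            if len(idxs) > ntest:
                popular.add(elt)
        for elt in popular:
            del b2j[elt]

    return b2j, junk, popular


b2j_default, junk_default, popular_default = chain_b("x" + "y" * 200)
b2j_plain, junk_plain, popular_plain = chain_b("x" + "y" * 200, autojunk=False)
```

Keeping the three steps separate means turning the heuristic off only skips the last step; the index map and the explicit junk handling are unaffected.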

I helped write and test the 2.7 patch and verify that default behavior remains 
unchanged. I believe it is ready to commit.

3.1 and 3.2 patches will follow.

--
stage: unit test needed -> commit review


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-09-02 Thread Eli Bendersky


Eli Bendersky eli...@gmail.com added the comment:

Attaching a patch (developed jointly with Terry Reedy) for 2.7 that adds an 
'autojunk' parameter to SequenceMatcher's constructor. The parameter is True by 
default which retains the current behavior in 2.6 and earlier, but can be set 
by the user to False to disable the popularity heuristic. The patch also fixes 
some documentation inconsistencies that Terry raised in this message.

Notes:
1. Tests run successfully. Added new test class in test_difflib for testing 
with the new autojunk parameter False
2. Patch generated vs. Hg mirror
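As a small usage sketch of the parameter described in this message (assuming an interpreter that already includes it, i.e. 2.7.1 or 3.2 and later), the same pair of strings can be compared with and without the popularity heuristic. The variable names are chosen only for the example.

```python
from difflib import SequenceMatcher

a = "x" + "y" * 199
b = "y" * 200  # 200 items, so the heuristic applies to b

# Default: autojunk=True, 'y' is dropped as a popular item.
default_ratio = SequenceMatcher(None, a, b).ratio()

# Heuristic disabled: 'y' is matched like any other item.
no_heuristic_ratio = SequenceMatcher(None, a, b, autojunk=False).ratio()

print(default_ratio, no_heuristic_ratio)
```

With the default, nothing in b is left to match against; with autojunk=False the 199 shared 'y' items are matched.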

--
Added file: http://bugs.python.org/file18719/issue2986.fix27.4.patch


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-09-01 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

While refactoring the code for 2.7, I discovered that the description of the 
heuristic for 2.6 and in the code comments is off by 1. 'Items that appear more 
than 1% of the time' should actually be 'items whose duplicates (after the 
first) appear more than 1% of the time'. The discrepancy arises because in the 
following code

    for i, elt in enumerate(b):
        if elt in b2j:
            indices = b2j[elt]
            if n >= 200 and len(indices) * 100 > n:
                populardict[elt] = 1
                del indices[:]
            else:
                indices.append(i)
        else:
            b2j[elt] = [i]

len(indices) is retrieved *before* the index i of the current elt is added. 
Whatever one might think the heuristic 'should' have been (and by the nature of 
heuristics, there is no right answer), the default behavior must remain as it 
is, so we adjusted the code and doc to match that.
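To make the off-by-one concrete, here is a standalone replica of the loop quoted above, wrapped in a function so it can be run on small inputs. The function name and the test data are invented for illustration; only the loop body follows the quoted code.

```python
def popular_items_2_6(b):
    """Replica of the quoted 2.6 loop; returns the items it flags as popular."""
    n = len(b)
    b2j = {}
    populardict = {}
    for i, elt in enumerate(b):
        if elt in b2j:
            indices = b2j[elt]
            # len(indices) counts only the occurrences seen before this one.
            if n >= 200 and len(indices) * 100 > n:
                populardict[elt] = 1
                del indices[:]
            else:
                indices.append(i)
        else:
            b2j[elt] = [i]
    return set(populardict)


filler = list(range(200))  # unique items used to pad b to exactly 200 entries

# 'y' occurs 3 times in 200 items (1.5%), yet is not flagged.
three = popular_items_2_6(["y"] * 3 + filler[:197])

# 'y' occurs 4 times in 200 items; the 4th occurrence sees 3 earlier ones.
four = popular_items_2_6(["y"] * 4 + filler[:196])

print(three, four)
```

An item at 1.5% frequency escapes because the check runs before the current index is appended, so the comparison is effectively made against the count minus one.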

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-08-02 Thread Barry A. Warsaw


Barry A. Warsaw ba...@python.org added the comment:

Georg committed this patch to the 2.6 tree, and besides, this doesn't seem 
like a blocking issue, so I'm kicking 2.6 off the list and knocking the 
priority down.

--
priority: release blocker -> high
versions:  -Python 2.6


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-31 Thread Georg Brandl


Georg Brandl ge...@python.org added the comment:

Deferring to after 3.2a1.

--
priority: release blocker -> deferred blocker


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-31 Thread Georg Brandl


Georg Brandl ge...@python.org added the comment:

Committed 2.6 patch in r83314.

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-31 Thread Georg Brandl


Changes by Georg Brandl ge...@python.org:


--
priority: deferred blocker -> release blocker


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-23 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

For 2.6 and 3.1, this is a documentation only issue.
For 2.7, this is a doc + behavior issue.
For 3.2, this is a doc + behavior + new feature issue.

For 2.6.6 (release candidate due Aug 2, 10 days), I propose to add the 
following paragraph after the current 'Timing:' paragraph in the 
SequenceMatcher entry ('Heuristic:' should be bold-faced, like 'Timing:')

Heuristic: To speed matching, items that appear more than 1% of the time in 
sequences of at least 200 items are treated as junk. This has the unfortunate 
side-effect of giving bad results for sequences constructed from a small set of 
items. An option to turn off the heuristic will be added to a future version.

I would have said 'to 2.7.1' but that has not happened yet. I thought about 
putting the heuristic paragraph first, but I think it fits better after the 
discussion of quadratic run time. I think it should be a separate paragraph and 
not tacked on the end of the previous paragraph so people will be more likely 
to take notice.

I have marked this a release blocker because at least 6 issues have been filed 
for this bug and so I think it important that the explanation be added to the 
next released doc. I plan to temporarily reassign this to d...@python in a few 
days.

--
nosy: +barry
priority: normal -> release blocker
type: feature request -> behavior
versions: +Python 2.6, Python 2.7, Python 3.1


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-23 Thread Eli Bendersky


Eli Bendersky eli...@gmail.com added the comment:

Here's a patch for Doc/library/difflib.rst of the 2.6 branch, following Terry's 
suggested addition to the docs of the SequenceMatcher class.

Tested 'make html'.

--
keywords: +patch
Added file: http://bugs.python.org/file18171/issue2986.docs26.1.patch


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-14 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

On Wednesday, 14 July 2010 at 01:45 +, Terry J. Reedy wrote:
> 
> 2. Add a parameter that defaults to using the heuristic but allows
> turning it off. Perhaps better, but code that used the new API would
> crash if run on 2.7.0

Yes, but this is an exceptional situation. We normally don't add new
APIs in bugfix versions. We'll have to live with it.

> 3.
> [...]
> Ugly, but perhaps crazy brilliant. Use of such a hack would obviously
> be temporary. Perhaps its use could be made to issue a -3 warning if
> such were enabled.

It's still incredibly ugly. Besides, code written for 2.7.1 might not
blow up with 2.7, but it will still have different behaviour.
If you are using the new parameter, it's because you *need* it, hence
different behaviour will be unacceptable; therefore, better to raise an
error as the API change proposal does.
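The "raise an error" behaviour comes for free from ordinary Python call semantics: a callable that does not declare a keyword rejects it. A minimal sketch, using a made-up stand-in function rather than the real 2.7.0 class:

```python
def old_style_matcher(isjunk=None, a="", b=""):
    """Hypothetical stand-in for a constructor signature without autojunk."""
    return (isjunk, a, b)


try:
    old_style_matcher(None, "abc", "abd", autojunk=False)
    outcome = "accepted"
except TypeError:
    # Unknown keyword arguments are rejected before the body runs.
    outcome = "TypeError"

print(outcome)
```

So code that needs the heuristic off fails loudly on a version that cannot honour the request, instead of running with different results.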

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-13 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

[copied from pydev post]

Summary: adding an autojunk heuristic to difflib without also adding a way to 
turn it off was a bug because it disabled running code.

2.6 and 3.1 each have, most likely, one final version each. Don't fix for these 
but add something to the docs explaining the problem and future fix.

2.7 will have several more versions over several years and will be used by 
newcomers who might encounter the problem but not know to diagnose it and patch 
a private copy of the module. So it should have a fix.  Solutions thought of so 
far.

1. Modify the heuristic to somewhat fix the problem. Bad (unacceptable) because 
this would silently change behavior and could break tests.

2. Add a parameter that defaults to using the heuristic but allows turning it 
off. Perhaps better, but code that used the new API would crash if run on 2.7.0

3.
Tim Peters
> Think the most pressing thing is to give people a way to turn the damn
> thing off.  An ugly way would be to trigger on an unlikely
> input-output behavior of the existing isjunk argument.  For example,
> if
>
>     isjunk("what's the airspeed velocity of an unladen swallow?")
>
> returned
>
>     "don't use auto junk!"
>
> and 2.7.1 recognized that as meaning "don't use auto junk", code could
> be written under 2.7.1 that didn't blow up under 2.7.  It could
> _behave_ differently, although that's true of any way of disabling the
> auto-junk heuristics.

Ugly, but perhaps crazy brilliant. Use of such a hack would obviously be 
temporary. Perhaps its use could be made to issue a -3 warning if such were 
enabled.

I would simplify the suggestion to something like
isjunk("disable!heuristic") == True
so one could pass
lambda s: s == "disable!heuristic"
It should be something easy to document and write. This issue is the only place 
such a string should appear, so it should be safe.
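For illustration only, the sentinel idea could be detected on the library side along these lines. The helper name `heuristic_enabled` and the constant are hypothetical, and nothing here describes code that exists in difflib.

```python
SENTINEL = "disable!heuristic"  # hypothetical magic value from the proposal


def heuristic_enabled(isjunk):
    """Return False only when isjunk opts out by answering True for the sentinel."""
    if isjunk is None:
        return True
    return not bool(isjunk(SENTINEL))


# A caller who wants the heuristic off passes a function that recognises the sentinel.
opt_out = lambda s: s == SENTINEL
# An ordinary junk function is unaffected.
blanks_are_junk = lambda s: s == " "

print(heuristic_enabled(opt_out), heuristic_enabled(blanks_are_junk))
```

The catch is that an opt-out function written this way treats nothing else as junk, so combining the sentinel with a real junk test needs an extra `or` clause.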

Tim and Antoine: if you two can agree on what to do for 2.7, Eli and I will 
code it.

This suggestion amounts to a suggestion that the fix for 2.7 be decoupled from 
a better fix for 3.2. I agree. The latter can be discussed once 2.7 is settled.

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-08 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

Anyone can post on Python-dev, but non-developers should do so judiciously and 
with respect for the purpose of the list. It is also polite to introduce 
oneself with the first post. In any case, Tim Peters has approved making some 
change. The remaining question is exactly what.

There is no problem with extending the API in 3.2. The debate there is over 2.7.

My fourth proposal, detailed on pydev, is to introduce a fourth parameter, 
'common', to set the frequency threshold to None or int 1-99.

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-08 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

> There is no problem with extending the API in 3.2. The debate there is
> over 2.7.

We could extend the API as long as it stays backwards-compatible (that
is, the default value for the new argument produces the same behaviour
as before).

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-08 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

My proposal F, to expose the common frequency threshold as a fourth positional 
parameter with default 1, would do that: repeat current behavior. We should, 
and Eli and I would, add some of the anomalous cases to the test suite and 
verify that the default is to reproduce the current anomalies, and that passing 
None changes the result.

Any opinions, anyone, on 'common', 'thresh', 'threshold', or anything else as 
the new parameter name?

We will have to explain in the doc patch that the parameter is new in 2.7.1 to 
fix a partial bug and that giving any explicit value will make code not run 
with 2.7 (.0).

Exposing the set of common values as an instance attribute, as I proposed on 
pydev, would be a new feature not needed to fix the bug. So it should be 
limited to 3.2.

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-07 Thread Vlastimil Brom


Vlastimil Brom vlastimil.b...@gmail.com added the comment:

I guess I am not supposed to post to python-dev, not being a Python 
developer, so I hope it is appropriate to add a comment here, based only on my 
current usage of (a modified) difflib.SequenceMatcher.
The mentions of text comparison in that thread, e.g. 
http://mail.python.org/pipermail/python-dev/2010-July/101515.html
seem to imply line-by-line comparison, possibly followed by character 
comparison of matched lines.
For me, direct character-wise comparison is more useful in most cases.
With the popularity heuristic disabled, the results look quite good.
(The script only changes the background colour of the compared texts, 
based on SequenceMatcher.get_opcodes().)
For now I only need to disable the popularity check; currently I use a 
monkey-patched subclass of SequenceMatcher with an extended signature and a 
modified __chain_b function.
cf. http://mail.python.org/pipermail/python-list/2010-June/1247907.html

I would vote for extending the SequenceMatcher API to enable adjustments 
(leaving the default values as the current ones) - enable/disable popular 
check, set the thresholds for string length and popular frequency (and 
eventually other parameters, which might be added).

Are there some restrictions on API changes in a library due to a moratorium - 
even if the default behaviour remains unchanged?
Otherwise, what might be the disadvantages of this approach?
If the current behaviour is considered appropriate for the original usecases, 
other uses would be also made possible/easier - only at the cost of learning 
the meaning of the added parameters - from the enhanced docs, of course.

vbr

--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-07 Thread Terry J. Reedy


Changes by Terry J. Reedy tjre...@udel.edu:


--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-07 Thread Terry J. Reedy


Changes by Terry J. Reedy tjre...@udel.edu:


--


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-06 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

[Also posted to pydev for additional input, with Subject line
Issue 2986: difflib.SequenceMatcher is partly broken
Developed with input from Eli Bendersky, who will write patchfile(s) for 
whichever change option is chosen.]

Summary: difflib.SequenceMatcher was developed, documented, and originally 
operated as a flexible class for comparing pairs of sequences of any 
[hashable] type. An experimental heuristic was added in 2.3a1 to speed up 
its application to sequences of code lines, which are selected from an 
unbounded set of possibilities. As explained below, this heuristic partly to 
completely disables SequenceMatcher for realistic-length sequences from a small 
finite alphabet. The regression is easy to fix. The docs were never changed to 
reflect the effect of the heuristic, but should be, with whatever additional 
change is made.

In the commit message for revision 26661, which added the heuristic, Tim Peters 
wrote "While I like what I've seen of the effects so far, I still consider this 
experimental.  Please give it a try!" Several people who have tried it 
discovered the problem with small alphabets and posted to the tracker. Issues 
#1528074, #1678339, #1678345, and #4622 are now-closed duplicates of #2986. The 
heuristic needs revision.

Open questions (discussed after the examples): what exactly to do, which 
versions to do it to, and who will do it.

---
Some minimal difference examples:

from difflib import SequenceMatcher as SM

# base example
print(SM(None, 'x' + 'y'*199, 'y'*199).ratio())
# should be and is 0.9975 (rounded)

# make 'y' junk
print(SM(lambda c:c=='y', 'x' + 'y'*199, 'y'*199).ratio())
# should be and is 0.0

# Increment b by 1 char
print(SM(None, 'x' + 'y'*199, 'y'*200).ratio())
# should be .995, but now is 0.0 because y is treated as junk

# Reverse a and b, which increments b
print(SM(None, 'y'*199, 'x' + 'y'*199).ratio())
# should be .9975, as before, but now is 0.0 because y is junked

The reason for the bug is the heuristic: if the second sequence is at least 200 
items long then any item occurring more than one percent of the time in the 
second sequence is treated as junk. This was aimed at recurring code lines like 
'else:' and 'return', but can be fatal for small alphabets where common items 
are necessary content.
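On interpreters that expose the junk sets as attributes (3.2 and later, an assumption about the reader's environment), the effect described above can be observed directly. The variable names are chosen for this sketch.

```python
from difflib import SequenceMatcher

# b has 200 items, all 'y', so the heuristic applies and flags 'y'.
matcher = SequenceMatcher(None, "x" + "y" * 199, "y" * 200)
flagged = matcher.bpopular   # items dropped by the popularity heuristic
indexed = matcher.b2j        # items of b still available for matching

# b has only 199 items, so the heuristic is not applied at all.
short = SequenceMatcher(None, "x" + "y" * 199, "y" * 199)
short_flagged = short.bpopular

print(flagged, indexed, short_flagged)
```

Once the only item in b is flagged, the index map is empty and there is nothing left for the matcher to find.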

A more realistic example than the above is comparing DNA gene sequences. 
Without the heuristic SequenceMatcher.get_opcodes() reports an appropriate 
sequence of matches and edits and .ratio works as documented and expected.  For 
1000/2000/6000 bases, the times on an old Athlon 2800 machine are 1/2/12 
seconds. Since 6000 is longer than most genes, this is a realistic and 
practical use.

With the heuristic, everything is junk and there is only one match, ''=='' 
augmented by the initial prefix of matching bases. This is followed by one 
edit: replace the rest of the first sequence with the rest of the second 
sequence. A much faster way to find the first mismatch would be
    i = 0
    while first[i] == second[i]:
        i += 1
The match ratio, based on the initial matching prefix only, is spuriously low.
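A rough way to reproduce this on a current interpreter, assuming the autojunk parameter is available: build a random base sequence, change one base in the middle, and compare ratios. The seed, the sequence length, and the helper `common_prefix_len` are choices made for this sketch; the helper is a bounds-checked variant of the loop above.

```python
import random
from difflib import SequenceMatcher


def common_prefix_len(first, second):
    """Length of the shared prefix; bounded so identical inputs cannot overrun."""
    i = 0
    limit = min(len(first), len(second))
    while i < limit and first[i] == second[i]:
        i += 1
    return i


rng = random.Random(2986)  # arbitrary seed, used only for repeatability
first = "".join(rng.choice("ACGT") for _ in range(1000))

# Substitute a single base at index 500 so the sequences differ in one place.
swap = "A" if first[500] != "A" else "C"
second = first[:500] + swap + first[501:]

with_heuristic = SequenceMatcher(None, first, second).ratio()
without_heuristic = SequenceMatcher(None, first, second, autojunk=False).ratio()
prefix = common_prefix_len(first, second)

print(prefix, with_heuristic, without_heuristic)
```

With the heuristic on, only the 500-base prefix is counted, so the ratio comes out at about one half; with it off, the ratio should be close to 1.0, which is what a single substitution deserves.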

---
Questions:

1: what change should be made.

Proposed fix: Disentangle the heuristic from the calculation of the internal 
b2j dict that maps items to indexes in the second sequence b. Only apply the 
heuristic (or not) afterward.

Version A: Modify the heuristic to only eliminate common items when there are 
more than, say, 100 items (when len(b2j) > 100 where b2j is first calculated 
without popularity deletions).

This would leave DNA, protein, and printable ascii+[\n\r\t] sequences alone. On 
the other hand, realistic sequences of more than 200 code lines should have at 
least 100 different lines, and so the heuristic should continue to be applied 
when it (mostly?) 'should' be. This change leaves the API unchanged and does 
not require a user decision.

Version B: add a parameter to .__init__ to make the heuristic optional. If the 
default were True ('use it'), then the code would run the same as now (even 
when bad). With the heuristic turned off, users would be able to get the .ratio 
they may expect and need. On the other hand, users would have to understand the 
heuristic to know when and when not to use it. 

Version C: A more radical alternative would be to make one or more of the 
tuning parameters user settable, with one setting turning it off.

2. What type of issue is this, and which versions get changed.

I see the proposal as partial reversion of a change that sometimes causes a 
regression, in order to fix the regression. Such would usually be called a 
bugfix. Other tracker reviewers claim this issue is a feature request, not a 
bugfix. Either way, 3.2 gets the fix. The practical issue is whether at least 
2.7(.1) should get the fix, or whether the bug should forever continue in 2.x.

3. Who will make the change.

Eli will write a patch and I will check it. However, Georg Brandl

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-06 Thread Eli Bendersky


Eli Bendersky eli...@gmail.com added the comment:

Thanks!
Now let's see what the other devs say. The first response seems not to have
understood what you meant completely :-)

Eli

On Wed, Jul 7, 2010 at 01:18, Terry J. Reedy rep...@bugs.python.org wrote:


 Terry J. Reedy tjre...@udel.edu added the comment:

 [Also posted to pydev for additional input, with Subject line
 Issue 2986: difflib.SequenceMatcher is partly broken
 Developed with input from Eli Bendersky, who will write patchfile(s) for
 whichever change option is chosen.]

 Summary: difflib.SequenceMatcher was developed, documented, and originally
 operated as a flexible class for comparing pairs of sequences of any
 [hashable] type. An experimental heuristic was added in 2.3a1 to speed up
 its application to sequences of code lines, which are selected from an
 unbounded set of possibilities. As explained below, this heuristic partly to
 completely disables SequenceMatcher for realistic-length sequences from a
 small finite alphabet. The regression is easy to fix. The docs were never
 changed to reflect the effect of the heuristic, but should be, with whatever
 additional change is made.

 In the commit message for revision 26661, which added the heuristic, Tim
 Peters wrote "While I like what I've seen of the effects so far, I still
 consider this experimental.  Please give it a try!" Several people who have
 tried it discovered the problem with small alphabets and posted to the
 tracker. Issues #1528074, #1678339, #1678345, and #4622 are now-closed
 duplicates of #2986. The heuristic needs revision.

 Open questions (discussed after the examples): what exactly to do, which
 versions to do it to, and who will do it.

 ---
 Some minimal difference examples:

 from difflib import SequenceMatcher as SM

 # base example
 print(SM(None, 'x' + 'y'*199, 'y'*199).ratio())
 # should be and is 0.9975 (rounded)

 # make 'y' junk
 print(SM(lambda c:c=='y', 'x' + 'y'*199, 'y'*199).ratio())
 # should be and is 0.0

 # Increment b by 1 char
 print(SM(None, 'x' + 'y'*199, 'y'*200).ratio())
 # should be .995, but now is 0.0 because y is treated as junk

 # Reverse a and b, which increments b
 print(SM(None, 'y'*199, 'x' + 'y'*199).ratio())
 # should be .9975, as before, but now is 0.0 because y is junked

 The reason for the bug is the heuristic: if the second sequence is at least
 200 items long then any item occurring more than one percent of the time in
 the second sequence is treated as junk. This was aimed at recurring code
 lines like 'else:' and 'return', but can be fatal for small alphabets where
 common items are necessary content.
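
 The rule described above can be approximated in a few lines. This is a minimal sketch, not the difflib source: `popular_items` is a hypothetical helper, and its thresholds (200 items, one percent) are taken from the description in this message.

```python
from collections import Counter

def popular_items(b, min_len=200, percent=1):
    """Return the items of b that the described rule would treat as junk.

    Hypothetical helper, not part of difflib: an item is 'popular' when b has
    at least min_len items and the item makes up more than percent% of b.
    """
    n = len(b)
    if n < min_len:
        return set()
    counts = Counter(b)
    return {item for item, count in counts.items() if count * 100 > n * percent}

print(popular_items('x' + 'y' * 198))   # 199 items: below the length threshold
print(popular_items('y' * 200))         # 200 items: 'y' counts as popular
print(popular_items('ACGT' * 250))      # small alphabet: every base is popular
```

 For a four-letter alphabet every item exceeds one percent as soon as the sequence is long enough, which is why whole DNA sequences end up treated as junk.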

 A more realistic example than the above is comparing DNA gene sequences.
 Without the heuristic SequenceMatcher.get_opcodes() reports an appropriate
 sequence of matches and edits and .ratio works as documented and expected.
 For 1000/2000/6000 bases, the times on an old Athlon 2800 machine are
 1/2/12 seconds. Since 6000 is longer than most genes, this is a realistic
 and practical use.

 With the heuristic, everything is junk and there is only one match, ''==''
 augmented by the initial prefix of matching bases. This is followed by one
 edit: replace the rest of the first sequence with the rest of the second
 sequence. A much faster way to find the first mismatch would be
   i = 0
   while i < min(len(first), len(second)) and first[i] == second[i]:
       i += 1
 The match ratio, based on the initial matching prefix only, is spuriously
 low.

 ---
 Questions:

 1: what change should be made.

 Proposed fix: Disentangle the heuristic from the calculation of the
 internal b2j dict that maps items to indexes in the second sequence b. Only
 apply the heuristic (or not) afterward.

 Version A: Modify the heuristic to only eliminate common items when there
 are more than, say, 100 items (when len(b2j) > 100 where b2j is first
 calculated without popularity deletions).

 This would leave DNA, protein, and printable ascii+[\n\r\t] sequences alone.
 On the other hand, realistic sequences of more than 200 code lines should
 have at least 100 different lines, and so the heuristic should continue to
 be applied when it (mostly?) 'should' be. This change leaves the API
 unchanged and does not require a user decision.

 Version B: add a parameter to .__init__ to make the heuristic optional. If
 the default were True ('use it'), then the code would run the same as now
 (even when bad). With the heuristic turned off, users would be able to get
 the .ratio they may expect and need. On the other hand, users would have to
 understand the heuristic to know when and when not to use it.

 Version C: A more radical alternative would be to make one or more of the
 tuning parameters user settable, with one setting turning it off.
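
 Version A can be sketched as follows. This is an illustration under stated assumptions, not a patch: `build_b2j` is a hypothetical stand-alone function, and the cut-off of 100 distinct items is the "say, 100" suggested above. The index map is built in full first, and popular items are pruned afterwards only when the sequence is long and its alphabet is large.

```python
from collections import defaultdict

def build_b2j(b, min_len=200, min_distinct=100):
    """Map each item of b to the list of its indexes, then prune popular items.

    Hypothetical sketch of 'Version A': popular items are removed only when
    b is long enough AND contains more than min_distinct different items.
    """
    b2j = defaultdict(list)
    for index, item in enumerate(b):
        b2j[item].append(index)

    n = len(b)
    if n >= min_len and len(b2j) > min_distinct:
        # Collect the keys first so the dict is not modified while iterating.
        for item in [k for k, idxs in b2j.items() if len(idxs) * 100 > n]:
            del b2j[item]
    return dict(b2j)

dna = 'ACGT' * 250
print(sorted(build_b2j(dna)))            # small alphabet: nothing is pruned

lines = ['line %d' % i for i in range(150)] + ['return'] * 60
print('return' in build_b2j(lines))      # many distinct lines: 'return' is pruned
```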

 2. What type of issue is this, and which versions get changed.

 I see the proposal as partial reversion of a change that sometimes causes a
 regression, in order to fix the regression. Such would usually be called a
 bugfix. Other tracker reviewers claim this

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-06 Thread Eli Bendersky


Changes by Eli Bendersky eli...@gmail.com:


Removed file: http://bugs.python.org/file17891/unnamed


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-06 Thread Eli Bendersky


Eli Bendersky eli...@gmail.com added the comment:

I apologize for the previous message. It was created by mistake - by replying 
to Terry's mail which came from the bugtracker.

I wish I knew how to remove it from here - is this possible, and am I missing the 
relevant privileges?


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-02 Thread Eli Bendersky


Eli Bendersky eli...@gmail.com added the comment:

The new junk heuristic was added to difflib.py in SVN revision 26661 in 
2002 (which is, incidentally, the last revision to modify difflib.py). Its 
commit log says:

-
Mostly in SequenceMatcher.{__chain_b, find_longest_match}:
This now does a dynamic analysis of which elements are so frequently
repeated as to constitute noise.  The primary benefit is an enormous
speedup in find_longest_match, as the innermost loop can have factors
of 100s less potential matches to worry about, in cases where the
sequences have many duplicate elements.  In effect, this zooms in on
sequences of non-ubiquitous elements now.

While I like what I've seen of the effects so far, I still consider
this experimental.  Please give it a try!
-


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-06-28 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

The discussion on #1528074 references two other closed tracker issues:
#1678339 Test case that currently fails
#1678345 Patch to change behavior - rejected because crippled behavior is 
supposedly intentional and removing the change would slow things down.

The patch simply removes the internal heuristic. I think a better patch would 
be to make it optional, with a tunable popularity threshold.

I say 'supposedly intentional' because the code comments only justify the 
popularity hack for code line comparison and give no indication of awareness 
that it disables SequenceMatcher for general purpose use, and in particular, 
for non-toy finite character set comparisons of the type (ascii) used in all 
the examples.


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-06-25 Thread Terry J. Reedy


Terry J. Reedy tjre...@udel.edu added the comment:

This appears to be one of at least three duplicate issues: #1528074, #2986, and 
#4622. I am closing two, leaving 2986 open, and merging the nearly disjoint 
nosy lists. (If no longer interested, you can delete yourself from 2986.) 
#1711800 appears to be slightly different (if not, it could be closed also.)

Whether or not a new feature is ever added (earliest, now, 3.2), it appears 
that the docs need improvement to at least explain the current behavior. If 
someone who understands the issue could open a separate doc issue (for 
2.6/7/3.1/2) with a suggested addition, that would be great.

--
nosy: +LambertDW, eliben, gagenellina, janpf, jimjjewett, rtvd, sjmachin, 
tjreedy
versions:  -Python 2.7


[issue2986] difflib.SequenceMatcher not matching long sequences

2010-04-19 Thread Vlastimil Brom


Vlastimil Brom vlastimil.b...@gmail.com added the comment:

I just stumbled on some seemingly different unexpected behaviour of
difflib.SequenceMatcher, but it turns out it may have the same cause, i.e. the 
popular heuristics.
I hopefully managed to replicate it on an illustrative sample text, as 
included in the attached file. (I also mentioned this issue on the python-list 
http://mail.python.org/pipermail/python-list/2010-April/1241951.html but as 
there were no replies, I eventually concluded this might be a more appropriate place.)
Both strings differ in a minimal way, each having one extra character
in a strategic position, which probably meets some pathological case
for difflib.
Instead of just reporting the insertion and deletion of these single
characters (which works well for most cases - with most other
positions of the differing characters), the output of the
SequenceMatcher decides to delete a large part of the string in
between the differences and to insert the almost same text after that.
The attached code simply prints the results of the comparison with the
respective tags, and substrings. No junk function is used.
I get the same results on Python 2.5.4, 2.6.5, 3.1.1 on Windows XP SP3.
I didn't find any plausible mentions of such cases in the documentation, but 
after some searching I found several reports in the bug tracker mentioning the 
erroneous output of SequenceMatcher on longer repetitive sequences.

besides this
http://bugs.python.org/issue2986
e.g.
http://bugs.python.org/issue1711800
http://bugs.python.org/issue4622
http://bugs.python.org/issue1528074

In my case, disabling the popular heuristics as mentioned by John Machin in
http://bugs.python.org/issue1528074#msg29269

seems to have solved the problem; with a modified version of difflib containing:

if 0:   # disable popular heuristics
    if n >= 200 and len(indices) * 100 > n:
        populardict[elt] = 1
        del indices[:]

the comparison catches the differences in the test strings as expected - i.e. 
one character addition and deletion only. It is likely that some other use 
cases for difflib rely on the popular heuristics, but it also seems useful 
to have some control over this behaviour, which might not be appropriate in all 
cases.
(The issue seems to be the same in python 2.5, 2.6 and 3.1.)

regards,
   vbr

--
nosy: +vbr
Added file: http://bugs.python.org/file17001/difflib_test_inq.py


[issue2986] difflib.SequenceMatcher not matching long sequences

2009-10-02 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

The popularity heuristic could be tuned to depend on the number N of
distinct elements in the sequence, and kick in if an element appears say
more than 1/(N**0.5) of the time.
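
A rough sketch of that threshold, for illustration only: `popular_by_sqrt_rule` is a hypothetical helper and not part of difflib, and it assumes "N" means the number of distinct elements in the second sequence, as stated above.

```python
from collections import Counter

def popular_by_sqrt_rule(b):
    """Items whose share of b exceeds 1/sqrt(N), N = number of distinct items.

    Hypothetical helper illustrating the proposal; not part of difflib.
    """
    counts = Counter(b)
    if not counts:
        return set()
    threshold = 1 / len(counts) ** 0.5      # allowed fraction of the sequence
    total = len(b)
    return {item for item, c in counts.items() if c > threshold * total}

# Four-letter alphabet: threshold is 0.5, each base has a share of 0.25.
print(popular_by_sqrt_rule('ACGT' * 250))

# Code-like input: 151 distinct lines, so the threshold is about 0.08.
lines = ['line %d' % i for i in range(150)] + ['return'] * 60
print(popular_by_sqrt_rule(lines))
```

Under this rule a small alphabet is left alone, while a line repeated throughout a file with many distinct lines is still flagged.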

--
nosy: +pitrou


[issue2986] difflib.SequenceMatcher not matching long sequences

2009-10-01 Thread Geoffrey Bache


Changes by Geoffrey Bache gjb1...@users.sourceforge.net:


--
nosy: +gjb1002


[issue2986] difflib.SequenceMatcher not matching long sequences

2009-05-27 Thread R. David Murray


Changes by R. David Murray rdmur...@bitdance.com:


--
components: +Documentation, Library (Lib) -Extension Modules
priority:  - normal
stage:  - test needed
type:  - feature request
versions: +Python 3.2 -Python 2.5


[issue2986] difflib.SequenceMatcher not matching long sequences

2009-03-29 Thread Georg Brandl


Georg Brandl ge...@python.org added the comment:

Tim, I think you've had some enlightening comments about difflib issues
in the past.

--
assignee:  - tim_one
nosy: +georg.brandl, tim_one


[issue2986] difflib.SequenceMatcher not matching long sequences

2009-03-29 Thread Mike Rotondo


Mike Rotondo mroto...@gmail.com added the comment:

From the source, it seems that there is undocumented behavior to
SequenceMatcher which is causing this error. If b is longer than 200
characters, it will consider any element x in b that takes up more than
1% of its contents as popular, and thus junk. 

So, in this case, difflib is treating each individual digit as an
element of your sequences, and each one takes up more than 1% of the
complete sequence b. Therefore, each one is popular, and therefore
ignored.

A snippet which demonstrates this:

from difflib import SequenceMatcher
for i in range(1, 202)[::10]:
    a = "a" * i
    b = "b" + "a" * i
    s = SequenceMatcher(None, a, b)
    print(s.find_longest_match(0, len(a), 0, len(b)))

Up until i=200, the strings match, but afterwards they do not because "a"
is popular. 

Strangely, if you get rid of the "b" at the beginning of b, they
continue to match at lengths greater than 200. This may be a bug, I'll
keep looking into it but someone who knows more should probably take a
look too.

The comments from difflib.py say some interesting things:
 # b2j also does not contain entries for "popular" elements, meaning 
 # elements that account for more than 1% of the total elements, and
 # when the sequence is reasonably large (>= 200 elements); this can
 # be viewed as an adaptive notion of semi-junk, and yields an enormous
 # speedup when, e.g., comparing program files with hundreds of
 # instances of "return NULL;"

This seems to mean that you won't actually get an accurate diff in
certain cases, which seems odd. At the very least, this behavior should
probably be documented. Do people think it should be changed to get rid
of the popularity heuristic?

--
nosy: +mrotondo
versions: +Python 2.7


[issue2986] difflib.SequenceMatcher not matching long sequences

2009-03-29 Thread R. David Murray


R. David Murray rdmur...@bitdance.com added the comment:

On Mon, 30 Mar 2009 at 00:40, Mike Rotondo wrote:
 This seems to mean that you won't actually get an accurate diff in
 certain cases, which seems odd. At the very least, this behavior should
 probably be documented. Do people think it should be changed to get rid
 of the popularity heuristic?

A better way, I think, would be to provide a way to turn
it off (and then document it, of course).

--
nosy: +bitdancer


[issue2986] difflib.SequenceMatcher not matching long sequences

2008-05-27 Thread Nate


New submission from Nate [EMAIL PROTECTED]:

The following code shows no matches though the strings clearly match.

from difflib import * 

a =
'''39043203381559556628573221727792187279924711093861125152794523529732793117520068565885125032447020125028126531603069277213510312502702798781521250210814711252468946033191629862834564694482932523354428149539640297186717055152464370568794560959154441746654640262554157367545426801783736754129988985714104837148017837367541448283617148017837367541330684087148017837367541408596657148017837367541538510044714801783736754157158643714106907148017837367541474888907148017837362059576680178373675454488017831041705391546777051025363147367544777801783736754152171032271480178373675417378111377148017837367541727911516714801783736754176929952714801783736754175759835714801783736754173989658714801783104170550264677705512355737056879456095915445625329640826754157363006104258329145203115148103015957219995715478978791137801783736189510219832803777819819892374989136789814142131989249498926799891648825778109447511028842170482589787911378017831041705118365420736273279818012793603261597148017837361!
 
71798080178310415420736447510213871790638471586131412631592131012571210126718031314200414571314893700123874777987006697747115770067074789312578013869801783104120529166337056879456095918495136604565251349544838956219513495753741344870733943253617458316356794745831634651172458316348316144586052838244151360641656349118903581890331689038658903263218549028909605134957536316060'''
b =
'''46343203381559556628573221727792187279924711093861125152794523529732793117520068565885125032447020125028126531603069277213510312502702798781521250210814711252468946033191629862834564694482932523354428149539640297186717055152464370568794560959154441746654640262554157367545426801783736754129988985714104837148017837367541448283617148017837367541330684087148017837367541408596657148017837367541538510044714801783736754157158643714106907148017837367541474888907148017837362059576680178373675454488017831041705391546777051025363147367544777801783736754131821081171480178373675417378111377148017837367541727911516714801783736754176929952714801783736754175759835714801783736754173989658714801783104170550264677705512355737056879456095915445625329640826754157363006104258329145203115148103015957219995715478978791137801783736189510219832803777819819892374989136789814142131989249498926799891648825778109447511028842170482589787911378017831041705118365420736273279818012793603261597148017837361!
 
71798080178310415420736447510213871790638471412131420041457131485122165131466702097131466731723131466741536131466751581131466771649131466761975131467212090131467261974131467231858131467201556131467212538131467221553131467221943131467231748131466711452131467271787131412578013869801783104154307361718482280178373638585436251621338931320893185072980138084820801545115716861861152948618615002682261422349251058108327767521397977810837298017831041205291663370568794560959184951366045652513495448389562195134957537413448707339432536174583163'''
lst = [(a,b)]
for a, b in lst:
    print("---")
    s = SequenceMatcher(None, a, b)
    print("length of a is %d" % len(a))
    print("length of b is %d" % len(b))
    print(s.find_longest_match(0, len(a), 0, len(b)))
    print(s.ratio())
    for block in s.get_matching_blocks():
        m = a[block[0]:block[0]+block[2]]
        print("a[%d] and b[%d] match for %d elements and it is \"%s\"" %
              (block[0], block[1], block[2], m))

--
components: Extension Modules
messages: 67428
nosy: hagna
severity: normal
status: open
title: difflib.SequenceMatcher not matching long sequences
versions: Python 2.5

