Dennis Sweeney <[email protected]> added the comment:
Indeed, this is just a very unlucky case.
>>> n = len(longer)
>>> from collections import Counter
>>> Counter(s[:n])
Counter({0: 9056995, 255: 6346813})
>>> s[n-30:n+30].replace(b'\x00', b'.').replace(b'\xff', b'@')
b'..............................@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@'
>>> Counter(s[n:])
Counter({255: 18150624})
When checking "base", we're in this situation:

       pattern: @@@@@@@@
        string: .........@@@@@@@@
Algorithm says:        ^ these last characters don't match.
                        ^ this next character is not in the pattern.
Therefore, skip ahead a bunch:

       pattern:          @@@@@@@@
        string: .........@@@@@@@@

This is a match!
Whereas when checking "longer", we're in this situation:

       pattern: @@@@@@@@@
        string: .........@@@@@@@@
Algorithm says:         ^ these last characters don't match.
                         ^ this next character *is* in the pattern.
We can't jump forward.

       pattern:  @@@@@@@@
        string: .........@@@@@@@@

Start comparing at every single alignment...
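
To make the difference concrete, here is a minimal sketch of the skip rule
described above, run on the same toy strings as the diagrams. It is a
simplified model only: it is not CPython's actual search code and not the
attached reproducer.py. The name find_with_skip and the alignment counter are
made up for this illustration, and a plain set stands in for whatever "is this
byte in the pattern?" test the real implementation uses.

```python
def find_with_skip(haystack: bytes, needle: bytes):
    """Return (index, alignments_tried); index is -1 when there is no match.

    Hypothetical helper: a simplified model of the skip rule, not CPython's code.
    """
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0, 0
    needle_bytes = set(needle)  # assumption: exact membership test
    last = needle[-1]
    i = 0
    tried = 0
    while i <= n - m:
        tried += 1
        # Check the last byte of the window first, then the whole window.
        if haystack[i + m - 1] == last and haystack[i:i + m] == needle:
            return i, tried
        # No match here: look at the byte just past the current window.
        if i + m < n and haystack[i + m] not in needle_bytes:
            # No alignment covering that byte can match, so jump past it.
            i += m + 1
        else:
            # That byte *is* in the pattern: only a one-step advance is safe.
            i += 1
    return -1, tried


haystack = b"." * 9 + b"@" * 8
base = b"@" * 8
longer = b"@" * 9

print(find_with_skip(haystack, base))    # (9, 2): one jump, then a match
print(find_with_skip(haystack, longer))  # (-1, 9): every alignment is tried
```

In this toy model "base" is found after 2 alignments, while "longer" visits
all 9 possible alignments. The slowdown in the real data comes from this same
loss of the jump, at a much larger scale.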
I'm attaching reproducer.py, which replicates this from scratch without loading
data from a file.
----------
Added file: https://bugs.python.org/file49499/reproducer.py
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue41972>
_______________________________________