Tim Peters <t...@python.org> added the comment:
SequenceMatcher looks for the longest _contiguous_ match. "UNIQUESTRING" isn't the longest by far when autojunk is False, but it is the longest when autojunk is True. With autojunk on, all the characters in bpopular are left out of the index used to search for matches, which effectively prevents directly finding a match longer than 'QUESTR' (capital 'I' is also in bpopular).

The effects of autojunk can be surprising, and it would have been better if it were False by default. But I don't see anything unexpected here. Learn from experience and force it to False yourself ;-)

BTW, autojunk was introduced as a way to greatly speed up comparing files of code, viewing them as sequences of lines. In that context, autojunk is rarely surprising and usually helpful. But it more often backfires when comparing strings (viewed as sequences of characters) :-(

----------
nosy: +tim.peters

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46667>
_______________________________________