Tim Peters <t...@python.org> added the comment:

We can't change defaults without superb reason - Python has millions of users, 
and changing the output of code "that works" is almost always a non-starter.

Improvements to the docs are welcome.

In your example, try running this code after using autojunk=True:

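    # `first` and `sm` are the string and SequenceMatcher from your example.
    pending = ""
    for ch in first:
        if ch in sm.bpopular:
            # A "popular" character ends the current run.
            if pending:
                print(repr(pending))
                pending = ""
        else:
            pending += ch
    print(repr(pending))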

That shows how `first` is effectively broken into tiny pieces, given that the 
"popular" characters act like walls. Here's the start of the output:

'\nUN'
'QUESTR'
'NG\nL'
'x'
'f'
'.'
'L'
'b'
"'"
'x'
'v'
'1500'
','

and on & on. 'QUESTR' is the longest common contiguous substring remaining.
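
For anyone who wants to see the effect without the original inputs, here is a
minimal self-contained sketch. The strings `first` and `second` and the helper
`split_on_popular` are made up for illustration and are not from the report;
the only difflib pieces used are `SequenceMatcher`, its `autojunk` argument,
and its `bpopular` attribute. It assumes the documented heuristic: with
autojunk enabled and a second sequence of at least 200 items, any item that
accounts for more than about 1% of it is treated as "popular".

```python
from difflib import SequenceMatcher


def split_on_popular(text, popular):
    """Return the runs of `text` that remain once popular characters are removed."""
    pieces, pending = [], ""
    for ch in text:
        if ch in popular:
            # A popular character acts as a wall: close the current run.
            if pending:
                pieces.append(pending)
                pending = ""
        else:
            pending += ch
    if pending:
        pieces.append(pending)
    return pieces


# Hypothetical data: 'a' and 'b' each make up far more than 1% of `second`,
# and `second` is longer than 200 characters, so the heuristic applies.
second = "ab" * 150 + "UNIQUE"
first = "abUNIQabUEab"

with_heuristic = SequenceMatcher(None, first, second, autojunk=True)
without_heuristic = SequenceMatcher(None, first, second, autojunk=False)

popular = sorted(with_heuristic.bpopular)
popular_off = sorted(without_heuristic.bpopular)
pieces = split_on_popular(first, with_heuristic.bpopular)

print("popular with autojunk=True: ", popular)
print("popular with autojunk=False:", popular_off)
print("pieces of first:            ", pieces)
```

With autojunk=False, `bpopular` stays empty, so no characters are walled off
and `first` is considered as a whole.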

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46667>
_______________________________________