Serhiy Storchaka added the comment:
I think the changeset that made the decoders use _PyUnicodeWriter (issue16311)
is responsible for the regression.
For example, consider b'\x80abc'.decode('utf-8', 'backslashreplace').
The writer reserves a string buffer of size 4 (every byte produces at most 1
character). The first byte is invalid and is replaced by the 4-character string
'\\x80'. The writer increases min_length but doesn't resize the buffer, because
the buffer is already large enough to hold the replacement string. But the
following writes of the ASCII characters then overflow the buffer.
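
The sequence above can be modelled roughly in pure Python. This is only a toy
sketch, not the actual C code: ToyWriter, write_replacement, write_ascii and
simulate are hypothetical names, the amount by which min_length grows is an
assumption, and every byte >= 0x80 is treated as invalid for simplicity.

```python
class ToyWriter:
    """Toy model of the buffer bookkeeping (not CPython's _PyUnicodeWriter)."""

    def __init__(self, reserved):
        self.size = reserved        # allocated capacity, in characters
        self.pos = 0                # next write position
        self.min_length = reserved  # expected final length

    def write_replacement(self, text):
        # Assumption: the replacement stands in for one input byte, so the
        # expected final length grows by len(text) - 1.
        self.min_length += len(text) - 1
        # Problematic check as described: resize only if the replacement
        # itself does not fit, ignoring the input that is still to come.
        if self.pos + len(text) > self.size:
            self.size = self.min_length
        self.pos += len(text)

    def write_ascii(self):
        # The fast path assumes the space was reserved up front. The toy
        # raises here; real C code would write past the end of the buffer.
        if self.pos >= self.size:
            raise OverflowError("write past the end of the buffer")
        self.pos += 1


def simulate(data):
    """Return (writer, index of the input byte whose write overflows, or None)."""
    writer = ToyWriter(reserved=len(data))
    for index, byte in enumerate(data):
        try:
            if byte < 0x80:
                writer.write_ascii()
            else:
                # Simplification: every non-ASCII byte counts as invalid.
                writer.write_replacement('\\x%02x' % byte)
        except OverflowError:
            return writer, index
    return writer, None


writer, overflow_index = simulate(b'\x80abc')
print(writer.size, writer.min_length, overflow_index)
```

In this model the replacement exactly fills the 4-character buffer, so the
write of 'a' (input index 1) is the first one that no longer fits, even though
min_length already says that more space is needed.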
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue23321>
_______________________________________