New submission from STINNER Victor:
The codecs.StreamReaderWriter() class still has old, unfixed issues such as
issue #12508 (open since 2011). The owasp-pysec project even considers this
issue a security vulnerability:
https://github.com/ebranca/owasp-pysec/wiki/Unicode-string-silently-truncated
I propose modifying codecs.open() to reuse the io module: call io.open() with
newline=''. The io module is now battle-tested and handles many corner cases
of incremental codecs with multibyte encodings well.
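To make the proposal concrete, here is a minimal sketch of what such a wrapper could look like. It is not the attached PR: the function name open_with_io and the helper roundtrip are hypothetical, and the handling of the encoding=None case is an assumption about how binary mode would be kept.

```python
import io
import os
import tempfile


def open_with_io(filename, mode="r", encoding=None, errors="strict", buffering=-1):
    """Hypothetical stand-in for codecs.open() built on io.open()."""
    if encoding is None:
        # Assumption: without an encoding, keep the file in binary mode.
        if "b" not in mode:
            mode += "b"
        return io.open(filename, mode, buffering=buffering)
    # With an encoding, open in text mode. newline='' disables newline
    # translation, so "\r\n" and "\n" are passed through unchanged.
    mode = mode.replace("b", "")
    return io.open(filename, mode, buffering=buffering,
                   encoding=encoding, errors=errors, newline="")


def roundtrip(text, encoding):
    """Write text to a temporary file and read it back (demo helper)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.txt")
        with open_with_io(path, "w", encoding=encoding) as f:
            f.write(text)
        with open_with_io(path, "r", encoding=encoding) as f:
            return f.read()
```

With newline='' the text read back should match what was written, line endings included, even for a multibyte encoding such as UTF-16.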
With this change, codecs.open() can no longer be used with non-text
encodings... but I'm not sure that this feature ever worked in Python 3:
$ ./python -bb
Python 3.7.0a0
>>> import codecs
>>> f = codecs.open('test', 'w', encoding='rot13')
>>> f.write('hello')
TypeError: a bytes-like object is required, not 'str'
>>> f.write(b'hello')
TypeError: a bytes-like object is required, not 'dict'
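For comparison, the following sketch shows how the io module treats the same codec. It assumes that io.TextIOWrapper raises LookupError for a codec that is not a text encoding; the helper name io_accepts is hypothetical.

```python
import codecs
import io


def io_accepts(encoding):
    """Return True if io.TextIOWrapper accepts the given encoding name."""
    try:
        io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    except LookupError:
        # Raised for unknown codecs and for codecs that are not
        # text encodings, such as rot13.
        return False
    return True


# rot13 is a str-to-str transform, so it still works through codecs.encode().
rot13_hello = codecs.encode("hello", "rot13")
```

So a codecs.open() built on io.open() would reject rot13 when the file is opened, rather than failing later on write().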
The next step would be to deprecate the codecs.StreamReaderWriter class and
codecs.open(). But my last attempt to deprecate them was PEP 400, and it
wasn't a full success, so I now prefer to move step by step :-)
Attached PR:
* Modify codecs.open() to use io.open()
* Remove "; use codecs.open() to handle arbitrary codecs" from io.open() and
_pyio.open() error messages
* Replace codecs.open() with open() at various places
----------
components: Unicode
messages: 289362
nosy: ezio.melotti, haypo, lemburg, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Modify codecs.open() to use the io module instead of
codecs.StreamReaderWriter()
versions: Python 3.7
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue29783>
_______________________________________