On Thursday 02 October 2008 14:07:50, M.-A. Lemburg wrote:
> On 2008-10-02 13:50, Victor Stinner wrote:
> > This is a PEP (...)
>
> The PEP doesn't appear to address any potential changes. Wouldn't
> it be better to add such information to the Python3 documentation
> itself ?!

I don't know the right name for this document. Yes, it may move to Doc/ in
the Python3 source code.

> > Example of an invalid bytes sequence: ::
> >
> >     >>> str(b'\xff', 'utf8')
> >     UnicodeDecodeError
> >
> >     >>> str(b'\xff', 'iso-8859-1')
> >     'ÿ'
>
> You have left out all the options you have by using a different
> error handling mechanism (using a third parameter to str()), e.g.
> 'replace', 'ignore', etc.

Yes, I can explain why replace and ignore cannot be used in this case. If
you use ignore or replace, filenames will be valid unicode strings, but you
will be unable to open / copy / remove your file, because the decoded name
no longer encodes back to the original bytes of the filename.

> > Default encoding
> > ================
> >
> > Python uses "UTF-8" as the default Unicode encoding. You can read the
> > default charset using sys.getdefaultencoding(). The "default encoding" is
> > used by PyUnicode_FromStringAndSize().
>
> Not only there: the C API makes various assumptions on the default
> encoding as well. We should probably drop the term "default encoding"
> altogether and replace it with "utf-8".

The concept of "default encoding" is unclear in Python. Yes, we might remove
sys.getdefaultencoding() and write that PyUnicode_FromStringAndSize() uses
the UTF-8 charset.

> sys.setdefaultencoding() should probably be dropped altogether from
> Python3.

Yes.

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/
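
PS: a minimal sketch of the point about 'replace' and 'ignore', assuming a hypothetical filename b'caf\xff.txt' whose bytes are not valid UTF-8. The filename and the roundtrip() helper are made up for illustration. It only shows that the decoded string no longer encodes back to the original bytes, which is why such a name could not be used to reach the file again.

```python
# Hypothetical raw filename, as bytes: b'\xff' is not valid UTF-8.
raw_name = b'caf\xff.txt'


def roundtrip(raw, errors):
    """Decode raw as UTF-8 with the given error handler, then encode it back."""
    text = raw.decode('utf-8', errors)
    return text, text.encode('utf-8')


# 'replace' substitutes U+FFFD for the invalid byte.
replaced_text, replaced_bytes = roundtrip(raw_name, 'replace')
# 'ignore' silently drops the invalid byte.
ignored_text, ignored_bytes = roundtrip(raw_name, 'ignore')

# In both cases the bytes we get back differ from the original name.
print(repr(replaced_text), replaced_bytes == raw_name)
print(repr(ignored_text), ignored_bytes == raw_name)
```

Both comparisons print False: the strings are valid unicode, but neither one maps back to the bytes of the original name.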