On Wed, Jan 31, 2018 at 3:03 AM, Serhiy Storchaka <storch...@gmail.com> wrote:
> On 19.01.18 05:51, Guido van Rossum wrote:
>
>> Can someone explain to me why this is such a controversial issue?
>>
>> It seems reasonable to me to add new encodings to the stdlib that do the
>> roundtripping requested in the first message of the thread. As long as they
>> have new names that seems to fall under "practicality beats purity".
>> (Modifying existing encodings seems wrong -- did the feature request
>> somehow transmogrify into that?)
>>
>
> In any case you need to change your code. If add new error handler -- you
> need to change the decoding code to use this error handler:
>
>     text = data.decode(encoding, 'whatwgreplace')
>
> If add new encodings -- you need to support an alias table that maps
> standard encoding names to corresponding names of WHATWG encoding:
>
>     aliases = {'windows_1252': 'windows-1252-whatwg',
>                'windows_1251': 'windows-1251-whatwg',
>                'utf_8': 'utf-8-whatwg', # utf-8 + surrogatepass
>                ...
>               }
>     ...
>     text = data.decode(aliases.get(normalize_encoding(encoding),
>                                    encoding))
>
> I don't see an advantage of the second approach for the end user. And of
> course it is more costly for maintainers, because we will need to
> implement around 20 new encodings, and adds a cognitive burden for new
> Python users, which now have more tables of encodings in the documentation.
>

Hm. As a user, unless I run into problems with a specific encoding, I never
care about how many encodings we have, so I don't see how adding extra
encodings bothers those users who have no need for them.

There's a reason to prefer new encoding names (maybe augmented with an alias
table) over a new error handler: there are lots of places where encodings
are passed around via text files, Internet protocols, RPC calls, layers and
layers of function calls. Many of these treat the encoding as a string, not
as a (string, errorhandler) pair.
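
To make the distinction concrete, here is a rough sketch, not a proposal for
the actual implementation. The codec name "windows-1252-whatwg" is the
hypothetical one from the alias table quoted above and does not exist in the
stdlib; receive() and the (charset, payload) message shape are made up for
illustration. The sketch assumes the only difference from Python's cp1252 is
that the five bytes cp1252 leaves undefined decode to the C1 control
characters with the same value, as in the WHATWG index.

```python
import codecs

# Hypothetical codec name from the thread; NOT part of the stdlib.
NAME = "windows-1252-whatwg"


def _build_decoding_table():
    """Start from cp1252 and fill its undefined bytes WHATWG-style."""
    chars = []
    for b in range(256):
        try:
            chars.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            # cp1252 has no mapping for 0x81, 0x8D, 0x8F, 0x90, 0x9D;
            # map each to the code point with the same value.
            chars.append(chr(b))
    return "".join(chars)


_DECODING_TABLE = _build_decoding_table()
_ENCODING_TABLE = codecs.charmap_build(_DECODING_TABLE)


def _decode(data, errors="strict"):
    return codecs.charmap_decode(data, errors, _DECODING_TABLE)


def _encode(text, errors="strict"):
    return codecs.charmap_encode(text, errors, _ENCODING_TABLE)


def _search(name):
    # codecs.lookup() lowercases the name and turns hyphens into
    # underscores before calling search functions; accept either form.
    if name.replace("-", "_") == "windows_1252_whatwg":
        return codecs.CodecInfo(name=NAME, encode=_encode, decode=_decode)
    return None


codecs.register(_search)


def receive(message):
    """Hypothetical API that carries only (charset, payload).

    There is no slot for an error handler here, so the encoding name is
    the only thing the sender can use to select the behaviour.
    """
    charset, payload = message
    return payload.decode(charset)


data = b"caf\xe9 \x81"

try:
    receive(("cp1252", data))          # plain cp1252 rejects byte 0x81
except UnicodeDecodeError:
    pass

text = receive((NAME, data))           # same channel, different name
assert text.encode(NAME) == data       # and the bytes round-trip
```

With an error-handler design, receive() itself would have to grow an extra
parameter, and so would every layer between it and the sender.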
So there may be situations where a given API has no way to carry the
requirement for a special error handler, while the same API would have no
problem preserving just the encoding name.

--
--Guido van Rossum (python.org/~guido)
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/