05.02.18 05:01, Nick Coghlan wrote:
> On 2 February 2018 at 16:52, Steven D'Aprano <st...@pearwood.info> wrote:
>> If it were my decision, I'd have these codecs raise a warning (not an
>> error) when used for encoding. But I guess some people will consider
>> that either going too far or not far enough :-)
>
> Rob pointed out that one of the main use cases for these codecs is
> when going "Oh, this was decoded with a WHATWG encoding, which isn't
> right, so I need to re-encode it with that encoding, and then decode
> it with the right encoding". So encoding is very much part of the
> usage model: it's needed when you've received the data over a Unicode
> based interface rather than a binary one.
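
For reference, the re-encode-then-decode repair described above can be sketched with codecs that ship in the standard library. This is only an illustration: "cp1252" stands in for the WHATWG codec under discussion (which is not part of the stdlib), and the helper name redecode is made up for this example.

```python
def redecode(text, wrong_encoding, right_encoding):
    """Undo a wrong decode: recover the original bytes, then decode them properly.

    Hypothetical helper for illustration; it only works if the wrong decode
    was lossless for the text at hand.
    """
    original_bytes = text.encode(wrong_encoding)
    return original_bytes.decode(right_encoding)


original = "café"

# Simulate the bug: UTF-8 bytes arrive already decoded with the wrong codec.
mojibake = original.encode("utf-8").decode("cp1252")

# Repair it by re-encoding with the wrong codec and decoding with the right one.
repaired = redecode(mojibake, "cp1252", "utf-8")

print(mojibake)
print(repaired)
```

The repair depends on the encode step reproducing the original bytes exactly, which is why encoding support matters for this use case.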
Wasn't the "surrogateescape" error handler designed for this purpose?
WHATWG encodings solve the same problem as "surrogateescape", but:
1) They use a different range for representing unmapped characters.
2) Not all unmapped characters can be decoded, so decoding is lossy
and a round trip does not always work.
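
To make the comparison concrete, here is a minimal sketch of how "surrogateescape" behaves, using only the standard library. The sample bytes are arbitrary, and "ascii" is chosen just so that every byte above 0x7F is undecodable.

```python
# Bytes that are not valid ASCII: 0xE9, 0x81 and 0xFF cannot be decoded.
raw = b"caf\xe9 \x81\xff"

# Each undecodable byte 0xNN is smuggled through as the lone surrogate U+DCNN.
text = raw.decode("ascii", errors="surrogateescape")

# Collect the code points that landed in the escape range U+DC80..U+DCFF.
escaped = [hex(ord(ch)) for ch in text if "\udc80" <= ch <= "\udcff"]

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("ascii", errors="surrogateescape")

print(escaped)
print(round_tripped == raw)
```

Here the round trip is lossless for any input bytes, because every undecodable byte gets its own code point in U+DC80..U+DCFF.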
Python-ideas mailing list