> "correct" -> "corrected"

Thanks, fixed.

>> To convert non-decodable bytes, a new error handler "python-escape" is
>> introduced, which decodes non-decodable bytes using into a private-use
>> character U+F01xx, which is believed to not conflict with private-use
>> characters that currently exist in Python codecs.
>
> Would this mean that real private use characters in the file name would
> raise an exception? How? The UTF-8 decoder doesn't pass those bytes to
> any error handler.

The python-escape error handler is only used (and only meaningful) if
the env encoding is not UTF-8. For any other encoding, it is assumed
that no character actually maps to the private-use characters.

>> The error handler interface is extended to allow the encode error
>> handler to return byte strings immediately, in addition to returning
>> Unicode strings which then get encoded again.
>
> Then the error callback for encoding would become specific to the target
> encoding.

Why would it become specific? It can work the same way for any
encoding: take U+F01xx, and generate the byte xx.

>> If the locale's encoding is UTF-8, the file system encoding is set to
>> a new encoding "utf-8b". The UTF-8b codec decodes non-decodable bytes
>> (which must be >= 0x80) into half surrogate codes U+DC80..U+DCFF.
>
> Is this done by the codec, or the error handler? If it's done by the
> codec I don't see a reason for the "python-escape" error handler.

utf-8b is a new codec. However, the utf-8b codec is only used if the
env encoding would otherwise be utf-8. For utf-8b, the error handler
is indeed unnecessary.

>> While providing a uniform API to non-decodable bytes, this interface
>> has the limitation that chosen representation only "works" if the data
>> get converted back to bytes with the python-escape error handler
>> also.
>
> I thought the error handler would be used for decoding.

It's used in both directions: for decoding, it converts \xXX to
U+F01XX. For encoding, U+F01XX will trigger an error, which is then
handled by the handler to produce \xXX.
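
To make the two directions concrete, here is a minimal sketch of such a
handler. It is an illustration only, not the proposed implementation: the
function name python_escape_sketch and the registered name
"python-escape-sketch" are made up for this example, and it assumes a
Python 3 interpreter in which codecs.register_error lets an encode error
handler return bytes directly.

```python
import codecs

# Hypothetical base of the U+F01xx range discussed above (xx = 00..FF).
ESCAPE_BASE = 0xF0100


def python_escape_sketch(exc):
    """Illustrative error handler mapping byte xx <-> U+F01xx."""
    if isinstance(exc, UnicodeDecodeError):
        # Decoding: each non-decodable byte xx becomes U+F01xx.
        bad = exc.object[exc.start:exc.end]
        return "".join(chr(ESCAPE_BASE + b) for b in bad), exc.end
    if isinstance(exc, UnicodeEncodeError):
        # Encoding: each U+F01xx becomes the raw byte xx again.
        out = bytearray()
        for ch in exc.object[exc.start:exc.end]:
            cp = ord(ch)
            if not ESCAPE_BASE <= cp <= ESCAPE_BASE + 0xFF:
                raise exc  # a genuinely unencodable character
            out.append(cp - ESCAPE_BASE)
        return bytes(out), exc.end
    raise exc


codecs.register_error("python-escape-sketch", python_escape_sketch)

# Round trip through a non-UTF-8 encoding (ASCII is used here).
raw = b"caf\xe9"
text = raw.decode("ascii", "python-escape-sketch")
restored = text.encode("ascii", "python-escape-sketch")

# With UTF-8 the handler is never called on encoding, because U+F01xx
# is itself encodable; the result is the character's UTF-8 form.
utf8_form = "\U000f01e9".encode("utf-8", "python-escape-sketch")
```

The same handler body works unchanged for any target encoding, since it
never looks at exc.encoding. The last line shows why the handler is only
meaningful when the env encoding is not UTF-8: there the private-use
character encodes normally and the original byte would not be recovered.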

> "and" -> "an"

Thanks, fixed.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev