On Mon, Dec 8, 2008 at 2:44 PM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> On 2008-12-08 22:32, Adam Olsen wrote:
>> On Mon, Dec 8, 2008 at 2:01 PM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
>>> On 2008-12-08 21:45, Antoine Pitrou wrote:
>>>> M.-A. Lemburg <mal <at> egenix.com> writes:
>>>>> Such application specific error handlers could then also apply
>>>>> whatever fancy round-trip safe encoding of non-decodable bytes
>>>>> to Unicode escapes, private code points, etc. as seen fit by the
>>>>> application.
>>>> I'd argue that such fancy round-trip safe error handler should be
>>>> provided by Python. It's not reasonable to expect application coders
>>>> to come up with their own codec variation based on subtle details of
>>>> the unicode spec.
>>> Fair enough. We could add some e.g.
>>>
>>> * a round-trip safe escape error handler that uses a Unicode private
>>> code point area which we officially reserve for the Python
>>> interpreter
>>
>> This would of course alter the behaviour of those private code points,
>> preventing them from round-tripping properly.
>>
>> I don't think round-tripping can be done from an error handler. You
>> need a full codec to do it. A simple option is 8859-1. Or, ya know,
>> bytes. This has long since gotten repetitive..
>
> The error handler would just map the problem bytes to the private
> area. The application would then have to decide what to do with
> them, ie. the error handler only provides one half of the
> round-tripping.

By that point it's already too late. You've already conflated garbage
PUA with legitimate PUA. To make it work you need to treat those
legitimate PUA scalars as errors too, transforming them. A common
example is how escaping replaces a single '\' with '\\'.

Hrm. nul-escaping should work. Obviously it can't be used outside the
filesystem though, since data from elsewhere may contain a legitimate
nul.

--
Adam Olsen, aka Rhamphoryncus
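
To illustrate the escaping argument above, here is a minimal sketch in Python 3 syntax. It is not a proposed implementation. The names decode_escaped and encode_escaped, the reserved range U+E000..U+E0FF, and the escape marker U+E100 are all assumptions made up for this example. Undecodable bytes are mapped into the reserved range, and any legitimate character that already lies in that range (or is the marker itself) gets prefixed with the marker, the same way '\' becomes '\\'.

```python
# Hypothetical layout, chosen only for this sketch:
BASE = 0xE000        # an undecodable byte b is stored as chr(BASE + b)
LAST = BASE + 0xFF   # end of the range reserved for such bytes
ESC = "\uE100"       # marks "the next character is a literal"


def _escape_legit(text):
    """Prefix legitimate characters that collide with the reserved range."""
    return "".join(
        (ESC + ch) if (ch == ESC or BASE <= ord(ch) <= LAST) else ch
        for ch in text
    )


def decode_escaped(raw):
    """Decode UTF-8 bytes to str, keeping undecodable bytes recoverable."""
    out = []
    pos = 0
    while pos < len(raw):
        try:
            out.append(_escape_legit(raw[pos:].decode("utf-8")))
            break
        except UnicodeDecodeError as err:
            # err.start is relative to raw[pos:]; everything before it is valid.
            good = raw[pos:pos + err.start].decode("utf-8")
            out.append(_escape_legit(good))
            # Map the single offending byte into the reserved range.
            out.append(chr(BASE + raw[pos + err.start]))
            pos += err.start + 1
    return "".join(out)


def encode_escaped(text):
    """Inverse of decode_escaped; assumes text was produced by it."""
    out = bytearray()
    chars = iter(text)
    for ch in chars:
        if ch == ESC:
            # Escaped literal: emit the following character as plain UTF-8.
            out += next(chars).encode("utf-8")
        elif BASE <= ord(ch) <= LAST:
            # Reserved-range character: restore the original raw byte.
            out.append(ord(ch) - BASE)
        else:
            out += ch.encode("utf-8")
    return bytes(out)
```

With this scheme the stray byte b"\xe9" and the validly encoded character U+E0E9 decode to different strings, and both encode back to their original bytes. Doing the same thing from an error handler alone would not work, because the handler is only invoked for the undecodable bytes and never sees the legitimate private-use characters that need escaping.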