> If the bytes are mapped to single half surrogate codes instead of the
> normal pairs (low+high), then I can see that decoding could never be
> ambiguous and encoding could produce the original bytes.
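
For illustration, here is a minimal sketch of the half-surrogate mapping described above. It assumes the `surrogateescape` error handler as it later shipped in Python 3.1; the draft under discussion may have differed in detail, and the file name is made up for the example.

```python
# A file name encoded in Latin-1; the byte 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9.txt"

# Each undecodable byte 0xNN is mapped to the lone surrogate U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")
assert text == "caf\udce9.txt"

# Encoding with the same handler turns the lone surrogate back into
# the original byte, so the round trip is lossless.
roundtrip = text.encode("utf-8", errors="surrogateescape")
assert roundtrip == raw
```

Decoding stays unambiguous because a strict UTF-8 decoder rejects encoded surrogates, so a lone surrogate in the result can only have come from an escaped byte.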

I was confused by Markus Kuhn's original UTF-8b specification. I have
now changed the PEP to avoid using PUA characters at all.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com