On Tue, 18-09-2007 at 13:08 +0900, Stephen J. Turnbull wrote:
> > This is wrong: UTF-8 is specified for PUA. PUA is not special from the
> > point of view of UTF-8.
>
> It is from the point of view of the Unicode standard, specifically v5.
> Please see section 16.5, especially about the "corporate use subarea".

It is not. 16.5 doesn't say anything about UTF-8, and UTF-8 is already specified for PUA.

> > UTF-8 is defined for all Unicode scalar values,
>
> Sure, and what I propose is entirely compatible with the specification
> of UTF-8 as a UTF,

It is not. In UTF-8 '\ue650' is b'\xEE\x99\x90'; in your proposal it might be encoded as a single byte.

> > "C10. When a process interprets a code unit sequence which purports to
> > be in a Unicode character encoding form, it shall treat ill-formed code
> > unit sequences as an error condition and shall not interpret such
> > sequences as characters."
>
> Yeah, that's the one.
>
> While I'm uncomfortable advocating the position that my proposal is
> entirely compatible with C10,

It is not. Elements of PUA are characters.

> it is arguable that "mapping code units to
> characters in private space" is not the same as "interpreting them as
> characters".

It's not the same, but interpreting them as characters in PUA is obviously interpreting them as characters.

> chibi:MacPorts steve$ python -c 'import sys; print("%x" % ord(sys.argv[1]))' $(printf "\ue650")
> Traceback (most recent call last):
>   File "<string>", line 1, in ?
> TypeError: ord() expected a character, but string of length 6 found

I meant Python3, where sys.argv is a list of Unicode strings. It should work out of the box.

Why length 6? "\ue650" encoded in UTF-8 has length 3.

For an old discussion about using PUA to represent bytes undecodable as UTF-8, see http://www.mail-archive.com/[EMAIL PROTECTED]/ and subthreads with "roundtripping" in the subject.
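
To make the byte counts concrete, here is a minimal Python 3 sketch of the claims in this message. The variable names are illustrative only. The last check is just one possible explanation of the "length 6" in the traceback (an assumption: that printf passed the escape through as literal text), not something established in this thread.

```python
import unicodedata

pua_char = "\ue650"  # a code point in the Private Use Area (U+E000..U+F8FF)

# UTF-8 treats a PUA code point like any other scalar value in that range:
# it is encoded as three bytes.
encoded = pua_char.encode("utf-8")
print(encoded, len(encoded))

# The Unicode character database assigns PUA code points the
# general category "Co" (private use).
print(unicodedata.category(pua_char))

# A strict UTF-8 decoder rejects an ill-formed sequence instead of
# mapping it to some character.
strict_decode_failed = False
try:
    b"\xff".decode("utf-8")
except UnicodeDecodeError:
    strict_decode_failed = True
print(strict_decode_failed)

# Assumption: if the shell's printf did not interpret the \u escape,
# the argument would be the six literal characters \ u e 6 5 0.
literal = r"\ue650"
print(len(literal))
```

Under Python 3 the first check yields the three bytes EE 99 90, which is the encoding quoted above.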
-- 
   __("<         Marcin Kowalczyk
   \__/       [EMAIL PROTECTED]
    ^^     http://qrnik.knm.org.pl/~qrczak/

_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000