Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Baptiste Carvello Wed, 29 Apr 2009 01:44:58 -0700

Lino Mastrodomenico wrote:


Only for the new utf-8b encoding (if Martin agrees), while the
existing utf-8 is fine as is (or at least waaay outside the scope of
this PEP).

This is questionable. It would mean that \udcxx in a Python string sometimes denotes a surrogate and sometimes denotes raw bytes, depending on the history of the string.

By contrast, if the new utf-8b codec were to *supersede* the old one, \udcxx would always mean raw bytes (at least on UCS-4 builds, where surrogates are unused). The ambiguity could thus be avoided.
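
To make the ambiguity concrete, here is a minimal sketch. It assumes that utf-8b behaves like the "surrogateescape" error handler found in later Python 3 releases, which is used here only as a stand-in for the proposed codec. The byte string and variable names are hypothetical.

```python
# Illustration only: "surrogateescape" stands in for the proposed utf-8b codec.
raw = b"caf\xff"  # hypothetical file name containing an undecodable byte

# utf-8b style decoding: the undecodable byte 0xff is mapped to U+DCFF.
from_bytes = raw.decode("utf-8", errors="surrogateescape")

# A string written directly with a genuine lone surrogate code point.
from_literal = "caf\udcff"

# The two strings compare equal, so their history cannot be recovered.
ambiguous = from_bytes == from_literal

# Encoding with the same handler restores the original bytes.
round_trip = from_bytes.encode("utf-8", errors="surrogateescape")

print(ambiguous, round_trip)
```

Once the two strings exist, nothing in them records which one came from raw bytes, which is why a single codec with a single meaning for \udcxx would be easier to reason about.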


Baptiste

