Re: [Python-3000] How will unicode get used?

David Hopwood Thu, 21 Sep 2006 12:53:51 -0700

Fredrik Lundh wrote:
> David Hopwood wrote:
> 
>>For example, "ö" can be represented either as the precomposed character
>>U+00F6, or as "o" followed by a combining diaeresis (U+006F U+0308).
> 
> normalization is a good thing, though:
> 
>      http://www.w3.org/TR/charmod-norm/
> 
> (it would probably be a good idea to turn unicodedata.normalize into a 
> method for the new unicode string type).


Normalization is certainly a good thing to support. But that's orthogonal to
my point above: some abstract characters are represented by sequences of more
than one code point, and such sequences must not be split. Avoiding those
splits automatically also avoids splitting within the representation of a
single code point.
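
To make the distinction concrete, here is a rough Python sketch. The first
part uses the existing unicodedata.normalize to show that the two spellings of
"ö" are different code point sequences that normalize to the same string. The
second part is a hypothetical helper (safe_split is my name for illustration,
not an existing API) that nudges a split point to the right so that it never
lands between a base character and the combining marks that follow it. It
only looks at canonical combining classes, so it is an approximation and not
a full grapheme cluster segmentation.

```python
import unicodedata

precomposed = "\u00f6"   # "ö" as the single code point U+00F6
decomposed = "o\u0308"   # "o" followed by U+0308 COMBINING DIAERESIS

# Different code point sequences, same abstract character after normalization.
assert precomposed != decomposed
assert unicodedata.normalize("NFC", decomposed) == precomposed
assert unicodedata.normalize("NFD", precomposed) == decomposed


def safe_split(text, index):
    """Split text at or after index without cutting off combining marks.

    Hypothetical helper: if the requested split point falls directly before
    a character with a non-zero canonical combining class, move the split
    point right until the whole combining sequence stays in the left half.
    """
    while index < len(text) and unicodedata.combining(text[index]) != 0:
        index += 1
    return text[:index], text[index:]


left, right = safe_split("o\u0308x", 1)
print(len(left), len(right))
```

A naive text[:1] on the decomposed form would leave a bare "o" on one side and
a dangling diaeresis on the other, whichever normalization form is in use.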

Note that some abstract characters needed for living languages are representable
*only* by combining sequences.
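
One example of my own choosing (an assumption for illustration, not something
from the thread above) is "a" with both an ogonek and an acute accent, as
found in accented Lithuanian text. The sketch below normalizes that sequence
to NFC and prints the resulting code points; even in the composed form, a
combining mark remains.

```python
import unicodedata

# "a" + U+0328 COMBINING OGONEK + U+0301 COMBINING ACUTE ACCENT
sequence = "a\u0328\u0301"

# NFC composes as far as it can: "a" + ogonek becomes U+0105, but there is
# no single code point for the ogonek-plus-acute combination.
composed = unicodedata.normalize("NFC", sequence)
code_points = [f"U+{ord(ch):04X}" for ch in composed]
print(code_points)  # ['U+0105', 'U+0301']
```

So normalizing to NFC does not remove the need to keep combining sequences
intact when splitting.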

-- 
David Hopwood <[EMAIL PROTECTED]>



