From: "Tom Emerson" <[EMAIL PROTECTED]>

> But if I have a text string, and that string is encoded in UTF-16, and
> I want to access Unicode character values, then I cannot index that
> string in constant time.
>
> To find character n I have to walk all of the 16-bit values in that
> string accounting for surrogates. If I use UTF-32 I don't need to do
> that. This very issue came up during the discussion of how to handle
> surrogates in Python.

Would this not be the same issue for composite characters (a base letter
followed by combining marks), even *in* UTF-32? If you truly mean to work
with characters here, then it seems this is a problem you will always have.
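
To make the comparison concrete, here is a rough Python sketch of both cases.
The helper names (to_utf16_units, code_point_at, user_characters) are
hypothetical and made up for illustration. The grouping in user_characters
relies only on unicodedata.combining, which is a simplification and not full
grapheme cluster segmentation.

```python
import unicodedata


def to_utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def code_point_at(units, n):
    """Return the n-th code point (0-based) from a list of UTF-16 code units.

    This has to scan from the start, because a surrogate pair takes two
    code units, so the position of code point n is not known in advance.
    """
    i = 0
    count = 0
    while i < len(units):
        u = units[i]
        is_pair = (
            0xD800 <= u <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            # Combine high and low surrogate into one supplementary code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            step = 2
        else:
            cp = u
            step = 1
        if count == n:
            return cp
        count += 1
        i += step
    raise IndexError("code point index out of range")


def user_characters(s):
    """Group code points into a base plus following combining marks.

    Simplified assumption: any code point with a non-zero canonical
    combining class attaches to the preceding cluster.
    """
    clusters = []
    for ch in s:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# "a", U+10400 (outside the BMP), then "e" followed by U+0301 COMBINING ACUTE.
text = "a\U00010400e\u0301"
units = to_utf16_units(text)
clusters = user_characters(text)
```

Here the string is five UTF-16 code units and four code points (what UTF-32
would store), but only three characters as a reader would count them. Finding
the third of those still requires a scan, even with fixed-width code points.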


MichKa

Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/