> wrote:
> > o new UTF8String class (untested)
> 
> If this is part of the new unicodization to support
> full-unicode, there's some stuff we need to discuss.

It wasn't intended as such. phearbear says QNX wants to use UTF-8, whereas
Abi uses UCS-2, so I wrote the UTF8String class to make the conversion
easier. Strings are stored internally as UTF-8 byte sequences. There is a
home-made iterator for stepping through the string one sequence at a time,
and a function for converting the current sequence to UCS-4.
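
To illustrate the sequence-by-sequence access described above, here is a
minimal sketch. It is not the actual UTF8String code: the names
utf8_sequence_length, utf8_next and utf8_to_ucs4 are hypothetical, and I am
assuming sequences of at most 4 bytes and not rejecting overlong encodings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of UTF8String): number of bytes in the
// UTF-8 sequence that starts with lead byte b, or 0 if b cannot start one.
static std::size_t utf8_sequence_length(unsigned char b)
{
    if (b < 0x80) return 1;            // 0xxxxxxx
    if ((b & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((b & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((b & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;                          // continuation byte or invalid lead
}

// Decode the sequence starting at s[pos] into a UCS-4 value and advance
// pos past it. Returns false on truncated or malformed input, leaving pos
// unchanged. Overlong encodings are NOT rejected in this sketch.
static bool utf8_next(const std::string& s, std::size_t& pos, std::uint32_t& ucs4)
{
    if (pos >= s.size()) return false;

    const unsigned char lead = static_cast<unsigned char>(s[pos]);
    const std::size_t len = utf8_sequence_length(lead);
    if (len == 0 || pos + len > s.size()) return false;

    // Payload bits carried by the lead byte, indexed by sequence length.
    static const unsigned char lead_mask[5] = {0x00, 0x7F, 0x1F, 0x0F, 0x07};
    std::uint32_t value = lead & lead_mask[len];

    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(s[pos + i]);
        if ((c & 0xC0) != 0x80) return false;  // must be 10xxxxxx
        value = (value << 6) | (c & 0x3F);
    }

    ucs4 = value;
    pos += len;
    return true;
}

// Walk the whole string, one sequence at a time, collecting UCS-4 values.
// Stops at the first malformed sequence.
static std::vector<std::uint32_t> utf8_to_ucs4(const std::string& s)
{
    std::vector<std::uint32_t> out;
    std::size_t pos = 0;
    std::uint32_t cp = 0;
    while (pos < s.size() && utf8_next(s, pos, cp)) {
        out.push_back(cp);
    }
    return out;
}
```

With this sketch, the bytes 61 C3 A9 E2 82 AC come out as the three UCS-4
values 0x61, 0xE9 and 0x20AC.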

Currently conversion to UTF-8 is only from UCS-2, but adding conversion
from UCS-4 would be a trivial change. (I'm assuming that UCS-2 is the first
65536 codes of UCS-4. Is this correct?)
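
To show why I think the change would be small, here is a rough sketch of
the encoding direction. Again, this is not the real class: append_utf8 and
append_utf8_from_ucs2 are hypothetical names, and the sketch assumes that a
UCS-2 value is numerically the same code as in UCS-4, that codes stop at
0x10FFFF, and that surrogate values are refused rather than paired.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: append the UCS-4 code cp to out as UTF-8.
// Returns false (appending nothing) for surrogate values and for codes
// above 0x10FFFF; both limits are assumptions of this sketch.
static bool append_utf8(std::string& out, std::uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate range
    if (cp > 0x10FFFF) return false;                 // beyond assumed limit

    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return true;
}

// A UCS-2 unit is simply widened to 32 bits and sent through the same
// routine, so it never needs more than 3 bytes.
static bool append_utf8_from_ucs2(std::string& out, std::uint16_t unit)
{
    return append_utf8(out, static_cast<std::uint32_t>(unit));
}
```

Under those assumptions the UCS-2 entry point is just a thin wrapper, and
only the 4-byte branch is new for UCS-4 input.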

As a string class it's not nearly as functional as the others, but it's
not really intended as a replacement.

> We need to design the system so that a string is not
> built from a series of UTF-8 (or UTF-32) characters
> directly, but a series of "composed character" which
> in turn are a series of UTF-8 characters, the first
> being the main character, the remainder being zero-
> width modifiers.  We need this to support proper
> internationalization.  We probably need much
> discussion first actually.

Not sure I understand this. Can you explain how to use zero-width
modifiers?

Frank

Francis James Franklin
[EMAIL PROTECTED]

"No, she really likes me. She told me I look like Britney Spears, and why
would you say that to somebody you don't like?"
                                                           --- Elle Woods


