Me:
>> 2. Although the original UTF-8 design allows for up to 31 bits per
>>    character by using a 6-byte encoding, both the Unicode Consortium
>>    and ISO have decided that they don't need that full range any
>>    more. IIRC they only need 21 bits to represent all characters. So
>>    this is probably the reason for the statement somewhere in the
>>    FLTK code that it only handles 24 bits.
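
To make the 21-bit point concrete, here is a minimal sketch of a UTF-8
encoder restricted to the current Unicode range (U+0000 to U+10FFFF),
which never needs more than 4 bytes. The name encode_utf8 is made up
for illustration and is not an FLTK function.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, not part of FLTK: encode one code point as UTF-8.
// Returns an empty string for values above U+10FFFF and for surrogate
// code points (U+D800..U+DFFF), which are not valid scalar values.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return out;
    if (cp < 0x80) {                 // 7 bits  -> 1 byte
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {         // 11 bits -> 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {       // 16 bits -> 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                         // 21 bits -> 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

The largest code point, U+10FFFF, comes out as the 4 bytes F4 8F BF BF,
so the 5- and 6-byte forms are never needed.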

Ian:
> Indeed so - though a 32-bit type is simplest to handle (internally),
> and means that the value can be the same as the Unicode code point...
>
> But then endianness becomes an issue at interfaces - hence the need
> for utf8, which is immune to endianness. Of course, with larger types
> (16 or 32 bit) we can use the BOM to identify the endian ordering of
> the text, but that is such a bodge...
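
For reference, BOM sniffing of the kind Ian describes might look
roughly like the sketch below. The enum and the function name
detect_bom are hypothetical, not FLTK code. UTF-32 has to be checked
before UTF-16 because the UTF-32 LE mark (FF FE 00 00) begins with the
UTF-16 LE mark (FF FE), so UTF-16 LE text that starts with a BOM
followed by U+0000 would be misread - part of why it is a bodge.

```cpp
#include <cstddef>

// Hypothetical result type for this sketch.
enum class Bom { None, Utf8, Utf16BE, Utf16LE, Utf32BE, Utf32LE };

// Inspect the first bytes of a buffer and report which byte order
// mark, if any, is present. Longer marks are tested first.
Bom detect_bom(const unsigned char* p, std::size_t n) {
    if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF)
        return Bom::Utf32BE;
    if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00)
        return Bom::Utf32LE;
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return Bom::Utf8;
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return Bom::Utf16BE;
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return Bom::Utf16LE;
    return Bom::None;   // no mark: the byte order has to come from elsewhere
}
```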

>>    [It also says that only 16 bits are really needed for Linux and
>>    Windows, which fits with a limited 16-bit wchar_t implementation]

> Hmm, not convinced this is true - it is not uncommon to see utf16
> text with surrogate pairs in it (where the required code point does
> not fit in a utf16 entry and is split over 2 16-bit values) so that
> kind of implies that a 16-bit only implementation isn't going to
> work for us...
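
To illustrate Ian's point about surrogate pairs, here is a minimal
sketch of how two 16-bit UTF-16 units combine into one code point above
U+FFFF. The function names are made up for this example and are not
taken from FLTK.

```cpp
#include <cstdint>

// High (lead) surrogates occupy U+D800..U+DBFF.
bool is_high_surrogate(std::uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

// Low (trail) surrogates occupy U+DC00..U+DFFF.
bool is_low_surrogate(std::uint16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Combine a valid high/low pair into a code point in the range
// U+10000..U+10FFFF. The caller is expected to have checked both
// units with the predicates above.
std::uint32_t combine_surrogates(std::uint16_t hi, std::uint16_t lo) {
    return 0x10000u
         + ((static_cast<std::uint32_t>(hi) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(lo) - 0xDC00u);
}
```

For example, the pair D83D DE00 combines to U+1F600, which cannot be
held in a single 16-bit wchar_t.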

The [text] above is based on some comments in the code, so I assume
that Roman or O'ksi'D or Bill or someone had some insight/analysis to
back this up. I don't have the multi-language / script experience to
be able to judge.

As far as I can see, FLTK only needs to concentrate on how to display
UTF-8 characters at the moment. Anyone who is manipulating text with
composing characters, surrogates, bi-directional text, etc. should
really be using some other library, such as icu4c, for the bulk of
the work. Again, I have no experience of icu4c - I was just reading
the web pages - so have no idea if better alternatives are available,
or if they are fast and light enough for FLTK to link to them.
Maybe that's an RFE for 1.4 or 3.1...

Cheers
D.


_______________________________________________
fltk-dev mailing list
[email protected]
http://lists.easysw.com/mailman/listinfo/fltk-dev
