On Fri, 26 Sep 2008, Marco van de Voort wrote:
In our previous episode, Daniël Mantione said:
as I know D2009 (I think) handles this correctly, but I have no idea
how.
Let me put it like this: Someone writing a Russian/Arabic/Japanese spell
checker does not have to handle surrogates with UTF-16, but he does with
UTF-8, i.e. UTF-16 is much better for them than UTF-8.
Are you sure? There is a CJK plane above $FFFF.
Chinese yes, Japanese is fully BMP.
Afaik these are non-simplified glyphs used for titles etc. Less common than
normal script, but not that rare.
Someone writing a spell checker for old-Egyptian Hieroglyphs will have to
deal with surrogates. For those people UTF-16 has few advantages over
UTF-8 (although in practice it's still a bit easier to handle than UTF-8).
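
To make the difference concrete, here is a minimal sketch in Python (not FPC
code) that counts how many UTF-16 code units and UTF-8 bytes a single
character needs. The names code_units, needs_surrogates and samples are made
up for this illustration, and the sample characters are just picked from the
scripts mentioned in this thread.

```python
def code_units(ch):
    """Return (UTF-16 code units, UTF-8 bytes) needed for one code point."""
    utf16_units = len(ch.encode("utf-16-le")) // 2  # 2 bytes per code unit
    utf8_bytes = len(ch.encode("utf-8"))
    return utf16_units, utf8_bytes


def needs_surrogates(ch):
    """True if the code point lies above $FFFF, i.e. outside the BMP."""
    return ord(ch) > 0xFFFF


# Illustrative sample characters (chosen for this sketch only).
samples = {
    "Cyrillic ZHE (U+0416)": "\u0416",
    "Arabic BEH (U+0628)": "\u0628",
    "Hiragana A (U+3042)": "\u3042",
    "CJK Extension B (U+20000)": "\U00020000",
    "Egyptian hieroglyph (U+13000)": "\U00013000",
}

for name, ch in samples.items():
    u16, u8 = code_units(ch)
    print(f"{name}: UTF-16 units={u16}, UTF-8 bytes={u8}, "
          f"surrogates={needs_surrogates(ch)}")
```

The Cyrillic, Arabic and Hiragana samples each fit in one UTF-16 code unit
but take two or three bytes in UTF-8, while the two characters above $FFFF
take a surrogate pair in UTF-16 and four bytes in UTF-8.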
IMHO such assumptions can be made for end-user business code (and only if
the CJK pages above $FFFF are ancient and not in modern use); however, the
RTL and other libraries should simply be Unicode compliant. Period.
Yes.
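
For what "compliant" means at the library level, a rough sketch (again in
Python, not the actual RTL code) of surrogate-aware iteration over UTF-16
code units could look like this. The helper names iter_code_points and
to_utf16_units are hypothetical, and passing unpaired surrogates through
unchanged is an assumption of this sketch, not a statement about how any
real library behaves.

```python
def to_utf16_units(text):
    """Encode text as a list of 16-bit UTF-16 code units (as ints)."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def iter_code_points(units):
    """Yield code points from UTF-16 code units, combining surrogate pairs."""
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        if is_high and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # High + low surrogate: one code point above $FFFF.
            yield 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            # BMP character, or an unpaired surrogate passed through as-is
            # (an assumption of this sketch).
            yield u
            i += 1


units = to_utf16_units("a\U00020000b")
print([hex(u) for u in units])
print([hex(cp) for cp in iter_code_points(units)])
```

Code that only ever sees BMP text never enters the first branch, which is
why business code can get away with ignoring it; a library cannot assume
that.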
Daniël
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel