On 02/17/2011 10:35 AM, Graeme Geldenhuys wrote:
> You can't use FPC's Copy(), Pos() etc. reliably with
> UTF-8 text, because those RTL functions work purely on ANSI text
> (1-byte characters - speaking of String type text here) and don't know
> about multi-byte characters,
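That's the "magic" :-). pos() does find the correct multi-byte characters,
because it matches whole byte sequences; the position it returns is a byte
offset, not a character index.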
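
The reason a byte-wise search is safe here is that UTF-8 is self-synchronizing: lead bytes and continuation bytes occupy disjoint ranges, so the encoding of one character cannot match in the middle of another. A minimal sketch of that behaviour, written in Python rather than FPC; `byte_pos` is a hypothetical helper that only mimics a 1-based, byte-oriented pos(), it is not the RTL routine:

```python
def byte_pos(needle: str, haystack: str) -> int:
    """Return the 1-based byte offset of needle in the UTF-8 encoding
    of haystack, or 0 if it is absent (assumed pos()-like convention)."""
    # bytes.find() returns -1 when nothing matches, so adding 1 yields 0.
    return haystack.encode("utf-8").find(needle.encode("utf-8")) + 1


# "naive cafe" with U+00EF and U+00E9, each two bytes long in UTF-8.
text = "na\u00efve caf\u00e9"

byte_offset = byte_pos("\u00e9", text)   # counts bytes
char_offset = text.index("\u00e9") + 1   # counts code points

print(byte_offset, char_offset)
```

The match itself is correct; the two offsets differ only because the earlier two-byte character shifts everything after it by one byte. Code that feeds the byte offset back into a byte-oriented Copy() stays consistent, while code that treats it as a character count does not.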
> combining diacritics etc.
This of course does not work, as these "Unicode quirks" (this name was
not introduced by me!) mean that the same visual character can be
encoded in different ways. I don't know if it is even possible (and
sensible) to support this at language level.
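
To make the problem concrete, here is a small sketch, again in Python rather than FPC, of one visual character with two encodings. Normalizing both strings before comparing is one common library-level approach; it is shown only as an illustration, not as something FPC's pos() does:

```python
import unicodedata

precomposed = "\u00e9"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # "e" followed by U+0301 COMBINING ACUTE ACCENT

# Both render as the same glyph, but the byte sequences differ,
# so a byte-wise search for one never finds the other.
raw_equal = precomposed == decomposed
raw_found = decomposed.encode("utf-8").find(precomposed.encode("utf-8"))

# After NFC normalization both strings use the precomposed form.
nfc_equal = unicodedata.normalize("NFC", decomposed) == precomposed

print(raw_equal, raw_found, nfc_equal)
```

Normalization has to be applied to both the text and the search term, which is why it fits more naturally in a library routine than in the language's basic string functions.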
-Michael
--
_______________________________________________
Lazarus mailing list
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus