On 02/17/2011 10:35 AM, Graeme Geldenhuys wrote:

You can't use FPC's Copy(), Pos() etc. reliably with
UTF-8 text, because those RTL functions work purely on ANSI text
(1-byte characters - speaking of String type text here) and don't know
about multi-byte characters,
That's the "magic" :-). Pos() does find the correct multi-byte characters.
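
A minimal sketch of why a byte-wise search still works on UTF-8. It is written in Python rather than Pascal only so it can be run anywhere, and byte_pos is a made-up helper that mimics a 1-based byte search, not an RTL function. In UTF-8, the bytes that start a character never appear as continuation bytes, so a match for a complete character cannot begin in the middle of another one:

```python
def byte_pos(needle: str, haystack: str) -> int:
    """Return the 1-based byte offset of needle in the UTF-8 bytes of
    haystack, or 0 if it is absent (hypothetical helper for illustration)."""
    return haystack.encode("utf-8").find(needle.encode("utf-8")) + 1

text = "Gr\u00fc\u00dfe"  # "Gruesse" spelled with u-umlaut and sharp s

# u-umlaut occupies bytes 3-4, so sharp s starts at byte 5.
print(byte_pos("\u00df", text))  # 5
print(byte_pos("e", text))       # 7
print(byte_pos("x", text))       # 0
```

Note that the result is a byte offset, not a character index: the sharp s is the 4th character but is found at byte 5. That is fine as long as the offset is only passed on to other byte-based routines.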
combining diacritics etc.
This of course does not work, as these "Unicode quirks" (this name was not introduced by me!) mean that the same visual character can be encoded in different ways. I don't know if it is even possible (and sensible) to support this at the language level.
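
A small sketch of the problem, again in Python purely for illustration. The helper names are made up, and normalizing both sides to NFC is just one possible workaround, not something FPC does for you. The letter e with an acute accent can be stored as one precomposed code point or as a plain e followed by a combining accent, and the two have different bytes:

```python
import unicodedata

precomposed = "\u00e9"   # single code point: e with acute
decomposed = "e\u0301"   # plain e followed by combining acute accent

def contains_bytes(needle: str, haystack: str) -> bool:
    """Byte-wise substring test on the UTF-8 encodings."""
    return needle.encode("utf-8") in haystack.encode("utf-8")

def contains_normalized(needle: str, haystack: str, form: str = "NFC") -> bool:
    """Substring test after bringing both strings to the same normal form."""
    return unicodedata.normalize(form, needle) in unicodedata.normalize(form, haystack)

word = "caf" + decomposed  # looks like "cafe" with an accent on the e

print(contains_bytes(precomposed, word))       # False
print(contains_normalized(precomposed, word))  # True
```

The byte search misses the match even though both strings look identical on screen. Normalizing first makes the comparison work, but it has to be done explicitly by the application.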

-Michael

--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
