On Thu, Oct 20, 2011 at 2:54 PM, Michael Schnell <[email protected]> wrote:
> And thus functions like pos(), length() and myString[i] work on UTF-8 code
> bytes rather than on (displayed) characters.

Characters can be composed of separate code points, for example a base
character followed by a combining accent (so at least 4 bytes in UTF-16).
So if you write code that depends on [] indexing whole characters, it will
fail miserably in this case, in UTF-16 just as in UTF-8.

Mac OS X stores filenames in UTF-8 using the decomposed form, which is rather
unpleasant. If you convert such a name to UTF-16 for further work, the text
will not magically get composed, although one could pass it through a
composing pre-processor (that is, normalize it to a composed form such as NFC).

--
Felipe Monteiro de Carvalho
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
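
To make the point about composed versus decomposed text concrete, here is a
minimal sketch. It is written in Python rather than Pascal only because its
standard library ships a normalizer (unicodedata); the sample string and all
names in it (composed, decomposed, sizes, ...) are illustrative assumptions,
not anything from the thread or from the Lazarus sources.

```python
import unicodedata

# The same displayed character, written two ways (illustrative sample).
composed = "\u00e9"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def sizes(text):
    """Return (code points, UTF-8 bytes, UTF-16 bytes) for text."""
    return (
        len(text),
        len(text.encode("utf-8")),
        len(text.encode("utf-16-le")),  # "-le" avoids counting a BOM
    )


composed_sizes = sizes(composed)      # (1, 2, 2)
decomposed_sizes = sizes(decomposed)  # (2, 3, 4)

# Re-encoding UTF-8 as UTF-16 changes the bytes, not the code points:
# the text is still base letter + combining accent afterwards.
utf8_bytes = decomposed.encode("utf-8")
roundtrip = utf8_bytes.decode("utf-8").encode("utf-16-le").decode("utf-16-le")

# Only an explicit normalization pass composes the pair into one code point.
recomposed = unicodedata.normalize("NFC", decomposed)
```

With the decomposed form, indexing position 0 yields a bare "e" without its
accent, which is exactly the failure described above, whichever encoding the
string is held in.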
