I found that the current FPC does have Unicode support, but there are some problems.

- WideStrings work fine with UCS-2, but they (of course) have issues similar to those of UTF-8 strings when surrogate pairs are needed (which is rarely the case in Europe and America).

- FPC does not have a dedicated "UTF8String" type: the type defined as "UTF8String" is just the same as AnsiString. The compiler therefore can't tell which one the programmer means, and can't generate the appropriate code where the two need to be distinguished (e.g. when it should automatically convert between locale-encoded AnsiString, UTF8String and WideString).

- By design (for speed's sake), UTF8String (and WideString, when surrogate pairs are used) counts in code units ("subcodes") and not in Unicode characters, so the behavior is "unexpected" when doing things like s[i], pos(s), copy(), delete(), ... There are no _slow_ functions that provide the "expected" versions of s[i], pos(s), copy(), delete(), ... (I've yet to find out how I can print just the first character of a UTF8String :)

- There is no decent "character" type for UTF-8 or UTF-16 encoded characters (WideChar, a UCS-2 code, works as long as no surrogate pairs are needed).

- There are different options for how the compiler interprets the encoding of the source file. Apparently, if it detects the file to be UTF-8 encoded and a certain (otherwise correct) option is set, even "s := 'hallo äöü';" does not work as expected when s is a WideString. (Lazarus with its default settings suffers from this problem.)
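
To make the UTF-8 point concrete, here is a minimal sketch of what such "slow" helpers could look like. Utf8SeqLen, Utf8CharCount and Utf8FirstChar are names I made up for illustration, not existing RTL functions, and the sketch assumes the string already holds well-formed UTF-8 (it does not validate continuation bytes).

```pascal
program Utf8Demo;
{$mode objfpc}{$H+}

{ Byte length of the UTF-8 sequence that starts with lead byte b.
  Invalid lead bytes are treated as length 1 so a scan always advances. }
function Utf8SeqLen(b: Byte): Integer;
begin
  if b < $80 then Result := 1
  else if (b and $E0) = $C0 then Result := 2
  else if (b and $F0) = $E0 then Result := 3
  else if (b and $F8) = $F0 then Result := 4
  else Result := 1;
end;

{ Number of code points in s, found by hopping from lead byte to lead byte. }
function Utf8CharCount(const s: AnsiString): Integer;
var
  i: Integer;
begin
  Result := 0;
  i := 1;
  while i <= Length(s) do
  begin
    Inc(i, Utf8SeqLen(Ord(s[i])));
    Inc(Result);
  end;
end;

{ First code point of s, returned as a string of 1..4 bytes. }
function Utf8FirstChar(const s: AnsiString): AnsiString;
begin
  if s = '' then
    Result := ''
  else
    Result := Copy(s, 1, Utf8SeqLen(Ord(s[1])));
end;

var
  s: AnsiString;
begin
  s := #$C3#$A4#$C3#$B6#$C3#$BC;        { "äöü" written as raw UTF-8 bytes }
  WriteLn(Length(s));                   { 6: Length counts bytes }
  WriteLn(Utf8CharCount(s));            { 3: code points }
  WriteLn(Length(Utf8FirstChar(s)));    { 2: "ä" occupies two bytes }
end.
```

Here s[1] alone is only the first byte of "ä", which is why the plain s[i] and copy() give the "unexpected" results.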

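The same effect can be shown for WideString with a surrogate pair. Again this is only a sketch under assumptions: IsHighSurrogate and Utf16CharCount are hypothetical helper names, and the low half of a pair is not validated.

```pascal
program Utf16Demo;
{$mode objfpc}{$H+}

{ True if w is a UTF-16 high (lead) surrogate, $D800..$DBFF. }
function IsHighSurrogate(w: WideChar): Boolean;
begin
  Result := (Ord(w) >= $D800) and (Ord(w) <= $DBFF);
end;

{ Number of code points in ws. A high surrogate followed by another
  code unit is counted as one pair; the low unit is not checked here. }
function Utf16CharCount(const ws: WideString): Integer;
var
  i: Integer;
begin
  Result := 0;
  i := 1;
  while i <= Length(ws) do
  begin
    if IsHighSurrogate(ws[i]) and (i < Length(ws)) then
      Inc(i, 2)
    else
      Inc(i);
    Inc(Result);
  end;
end;

var
  ws: WideString;
begin
  SetLength(ws, 3);
  ws[1] := WideChar($0041);     { 'A' }
  ws[2] := WideChar($D834);     { high surrogate of U+1D11E }
  ws[3] := WideChar($DD1E);     { low surrogate of U+1D11E }
  WriteLn(Length(ws));          { 3: Length counts 16-bit code units }
  WriteLn(Utf16CharCount(ws));  { 2: 'A' plus one character outside the BMP }
end.
```

A WideChar can hold ws[2] or ws[3], but neither is a character on its own.
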
-Michael