On Mon, Apr 9, 2012 at 8:17 PM, Hans-Peter Diettrich <[email protected]> wrote:
> Marcos Douglas wrote:
>
>> I still think about:
>> DirectoryExists or DirectoryExistsUTF8
>> ForceDirectoriesUTF8 or ForceDirectories
>> Pos or UTF8Pos
>> etc
>>
>> Depends what part of code you are...
>
> Such problems may (should) go away with the new Unicode- and AnsiString
> types, where AnsiString contains an Encoding field. Then the conversion
> between UTF-8 and the system codepage is done automatically, whenever
> required, and the xyUTF8 functions can be dropped.
>
> I discourage the use of UTF8Pos, in particular together with the new
> (encoded) AnsiString type. When such a string is auto-converted, for some
> reason, the index returned by UTF8Pos will become invalid. This is one of
> the downsides of encoded strings, which suggests using UnicodeString in
> future code. Delphi enforced that move, by changing String and Char to
> UnicodeString and WideChar, and Delphi compatibility propagated that
> pressure into FPC. The continued use of UTF-8 strings (AnsiString) will
> result in a speed and memory usage penalty, unless the system codepage is
> UTF-8. If your code only contains String type strings, not AnsiString or
> UTF8String, then all your strings will become UnicodeStrings (UTF-16), for
> which the xyUTF8 functions are either inapplicable or will result only in
> superfluous implicit string conversions.
>
> Now every user has the choice to stay with a specific FPC/Lazarus version
> that does not yet support the new string types, or to drop UTF-8 strings in
> favor of the new UTF-16 strings. Since most code has to deal with the
> Unicode BMP (Basic Multilingual Plane) only, the difference between the
> length of a string in (UTF-8) chars and in characters goes away with
> UTF-16. Do you really see a need for finding the position of a non-BMP
> character in a string, and changing exactly that character in the string?
> Then you are on the safe side by using StringReplace, which already worked
> with UTF-8 and will continue to work with UTF-16 and whatever other
> encoding. The use of Char variables has been dangerous already with UTF-8,
> where exotic ("astral") characters take 4 bytes. In this respect I don't
> understand why Delphi now uses WideChar for Char, instead of UnicodeChar,
> where it is guaranteed that every codepoint (except ligatures and similar
> text-processing stuff) can be stored in a UnicodeChar variable.
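
To make the index problem above concrete: the position of a substring depends on whether it is counted in codepoints, UTF-8 bytes, or UTF-16 code units, so an index taken in one form no longer fits after a conversion. Below is a minimal sketch in Python (not FPC code; the helper name `unit_index` and the sample text are made up for illustration) that computes the same 1-based position in all three ways.

```python
def unit_index(text: str, needle: str, encoding: str, unit_size: int) -> int:
    """1-based position of needle in text, counted in code units of `encoding`.

    Returns 0 when needle is absent, mirroring the convention of Pascal's Pos.
    """
    cp = text.find(needle)          # 0-based position in codepoints
    if cp < 0:
        return 0
    # Encode only the part before the match and count its code units.
    prefix = text[:cp].encode(encoding)
    return len(prefix) // unit_size + 1


# U+00E9 (e with acute) is inside the BMP; U+1D11E (musical G clef) is not.
text = "caf\u00e9 \U0001D11E clef"
needle = "clef"

by_codepoint = text.find(needle) + 1                       # counted in characters
by_utf8_byte = unit_index(text, needle, "utf-8", 1)        # counted in bytes
by_utf16_unit = unit_index(text, needle, "utf-16-le", 2)   # counted in 16-bit units

print(by_codepoint, by_utf8_byte, by_utf16_unit)  # prints: 8 12 9
```

The BMP character costs two bytes in UTF-8 but a single unit in UTF-16, so for BMP-only text the UTF-16 index and the character index agree. The one non-BMP character is what pushes the UTF-16 index to 9 instead of 8. A replace-by-content call such as StringReplace never exposes an index, which is why it is unaffected.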

When will the new Unicode- and AnsiString types (the ones that contain an
Encoding field) arrive for us, users of FPC 2.6.1? Is this already done?

Marcos Douglas

--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
