Graeme Geldenhuys wrote:

   {$IFDEF WINDOWS}
      UnicodeString = type AnsiString(CP_UTF16);

AnsiStrings consist of bytes only, for good reasons (mostly performance). The Delphi developers originally wanted to implement what you suggest, but later dropped that approach.

String classes have the same performance problems, which is why in .NET, for example, it is recommended to use functions instead of string operators. Delphi and FPC use compiler magic instead of classes.

   {$ELSE}
      // probably not strictly correct, but assuming *nix here.
      // But you get the idea
      UnicodeString = type AnsiString(CP_UTF8);
   {$ENDIF}

   String = type UnicodeString;
   Char = type String[4];   // the maximum size of a Unicode codepoint is 4 bytes

A character type is of little use unless all strings are UTF-32 (which is quite unlikely now). Substrings should be used instead; they can contain any number of bytes or characters.
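To illustrate the substring idea, here is a minimal sketch that walks a UTF-8 byte string and extracts each code point as a substring of 1 to 4 bytes. The helper Utf8SeqLen and the sample bytes are made up for this example and are not part of the RTL; the code assumes well-formed UTF-8 and does no validation.

```pascal
program CodepointSubstrings;
{$mode objfpc}{$H+}

const
  // 'A', U+00E4 and U+20AC encoded as UTF-8 (sample data for this sketch)
  Sample: array[0..5] of Byte = ($41, $C3, $A4, $E2, $82, $AC);

// Hypothetical helper: byte length of the UTF-8 sequence starting with
// lead byte B. Assumes well-formed input; anything else counts as 1 byte.
function Utf8SeqLen(B: Byte): Integer;
begin
  if B < $80 then
    Result := 1
  else if (B and $E0) = $C0 then
    Result := 2
  else if (B and $F0) = $E0 then
    Result := 3
  else if (B and $F8) = $F0 then
    Result := 4
  else
    Result := 1;
end;

var
  S, Sub: AnsiString;
  I, N: Integer;
begin
  // Copy the raw bytes into the string so no code page conversion applies.
  SetLength(S, Length(Sample));
  Move(Sample[0], S[1], Length(Sample));

  I := 1;
  while I <= Length(S) do
  begin
    N := Utf8SeqLen(Ord(S[I]));
    Sub := Copy(S, I, N);  // the "character" is just a substring
    WriteLn('code point at byte ', I, ' occupies ', Length(Sub), ' byte(s)');
    Inc(I, N);
  end;
end.
```

For the sample data the three substrings are 1, 2 and 3 bytes long, so no fixed-size character type fits all of them.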

You also have to explain what String[4] means in a Unicode environment. The ShortString type does not have an encoding, and is thus deprecated in a Unicode environment.
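As a small sketch of the ambiguity: in current FPC, String[4] is a ShortString with room for 4 bytes plus a length byte, and it knows nothing about encodings. The type name TChar4 is invented for this example; the bytes stored are the UTF-8 encoding of U+1F600.

```pascal
program ShortStringBytes;
{$mode objfpc}{$H+}

type
  TChar4 = String[4];  // hypothetical name; a ShortString of at most 4 bytes

var
  C: TChar4;
begin
  // Store the four UTF-8 bytes of U+1F600 one by one.
  SetLength(C, 4);
  C[1] := Chr($F0);
  C[2] := Chr($9F);
  C[3] := Chr($98);
  C[4] := Chr($80);

  // Length counts bytes, so a single code point reports 4 here.
  WriteLn('Length = ', Length(C));
  // One length byte plus four data bytes.
  WriteLn('SizeOf = ', SizeOf(C));

  // The same type also holds four separate ASCII characters.
  C := 'abcd';
  WriteLn('Length = ', Length(C));
end.
```

Both assignments report a length of 4, although one holds a single code point and the other four characters, which is why the meaning of String[4] needs to be spelled out.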

Q: Have you ever read about the new string implementation of FPC?
Do you really want to reinvent the wheel in yet another incompatible way?

DoDi

