On Fri, 26 Sep 2008 13:20:57 +0200 Michael Schnell <[EMAIL PROTECTED]> wrote:
> Nonetheless a type to hold a single character needs to exist. And
> same needs to be a 32 bit type if you want to store more than 2^16
> different values (as possible with UTF-8 and UTF-16 but not with
> UCS-2).

Some characters are encoded as several Unicode code points. For
example, a German a-umlaut is stored under Mac OS X HFS+ in decomposed
form as 2 code points ('a' plus a combining diaeresis) = 1+2 bytes in
UTF-8 and 2+2 bytes in UTF-16.
This is not some Egyptian or Klingon, but normal German, Finnish,
French, etc.

An assignment like s[i]:='x' does not work reliably in UTF-8, nor in
UTF-16, nor even in UTF-32, because one visible character can span
several elements of the array.

In short: A single character type for all purposes cannot be defined.
Unicode cannot be handled as an array of characters.
The choice between UTF-8 and UTF-16 depends mostly on the libraries
used and on compatibility. The more Unicode features you want to
support, the less important the encoding becomes.
The encoding can be important for speed: for example, the widestring
XML parser is up to 10 times slower than the ansistring XML parser.

Mattias
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
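
P.S. The byte counts for the a-umlaut can be checked with a few lines
of code. This is only a rough sketch, written in Python rather than
Pascal because its standard library includes the normalization tables.
It assumes that plain NFD normalization is a fair stand-in for the
decomposed form the file system stores (true for this character, not
necessarily for all). The names encoded_sizes and replace_first are
made up for the illustration.

```python
import unicodedata


def encoded_sizes(text):
    """Return (code points, UTF-8 bytes, UTF-16 bytes, UTF-32 bytes)."""
    return (
        len(text),
        len(text.encode("utf-8")),
        len(text.encode("utf-16-le")),  # "-le" variant: no byte order mark
        len(text.encode("utf-32-le")),
    )


def replace_first(text, ch="x"):
    """Naive equivalent of s[1]:='x' applied to an array of code points."""
    return ch + text[1:]


composed = "\u00e4"                                  # precomposed a-umlaut
decomposed = unicodedata.normalize("NFD", composed)  # 'a' + U+0308

print(encoded_sizes(composed))    # (1, 2, 2, 4)
print(encoded_sizes(decomposed))  # (2, 3, 4, 8)

# Both render as the same letter, but they are different sequences.
print(composed == decomposed)     # False

# Overwriting one element leaves the combining mark behind, so the
# result is an 'x' with a diaeresis, not a plain 'x'.
broken = replace_first(decomposed)
print(len(broken), unicodedata.name(broken[1]))  # 2 COMBINING DIAERESIS
```

The last two lines are the point: even with one array element per code
point (the UTF-32 case), replacing a single element does not replace a
single visible character.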