In our previous episode, Hans-Peter Diettrich said:
> > Lazarus was forced to make out of the identity of ANSIString and
> > UTF8String seemingly forced by FPC. e.g.:
> >
> > Old programs assuming local ANSI 8 bit code retrieved from LCL GUI
> > components, compiled with the new version don't work (e.g. if doing
> > myChar := myString[3]; )
>
> How many bytes must a char have, when it shall allow to store any
> (logical) character?
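
To make the quoted myString[3] case concrete, here is a minimal sketch. It is in Python purely for illustration; the sample string and the helper `sizes` are made up for this example and are not taken from FPC or the LCL. It assumes the string is UTF-8 encoded and shows that the third byte need not be the third character, and that one visible character can span more than one code point:

```python
import unicodedata

# Hypothetical sample: "aéb" with a precomposed é (U+00E9).
text = "a\u00e9b"
utf8 = text.encode("utf-8")          # four bytes: 61 C3 A9 62

# Pascal's myString[3] is 1-based, so it corresponds to utf8[2] here.
third_byte = utf8[2]                 # second byte of 'é', not the letter 'b'

# The same visible text in decomposed form: 'e' followed by U+0301.
decomposed = unicodedata.normalize("NFD", text)

def sizes(s):
    """Return (UTF-8 bytes, code points, combining marks) for s."""
    marks = sum(1 for ch in s if unicodedata.combining(ch))
    return len(s.encode("utf-8")), len(s), marks

print(hex(third_byte))               # 0xa9
print(sizes(text))                   # (4, 3, 0)
print(sizes(decomposed))             # (5, 4, 1)
```

Both strings display as the same three characters, yet they differ in byte count and in code point count, so neither a 1-byte nor a fixed-width multi-byte char is enough to hold "one logical character" in general.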

According to Unicode: n code points. A code point currently needs about
21 bits, but that could be expanded in the more distant future. IIRC n is
in the range of 5-8 or so, the maximum number of code points that can be
combined into one printable character. So if you want to do it up to
spec, a character is roughly 256 bits (e.g. 8 code points at 32 bits each).

> Unicode users have no use for an char type, instead they have to use
> substrings for every logical character. A Unicode BMP user could be happy
> with a 2-byte char, of course, at his own (low) risk.

Probably. But while that is a good point for an application builder based
in the West, it is IMHO not acceptable to cut corners in the Unicode
implementation of system and development tools.
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel