Besides, in the current implementation UTF8 might be at a disadvantage compared
with encodings of 2 or more bytes. Those encodings use the WideString format,
and conversion to the old string type can be done either automatically or via
special procedures (as seems to be the case on Kylix). UTF8 is implemented as
an ordinary string. That has some advantages (it is easy to work with), but one
big disadvantage: when working with strings in a 1-byte encoding, everyone
assumes that one character is one byte.
Let's imagine an old Pascal/Delphi program that works heavily with Russian
words. It assumes that the length (number of letters) of a word held in a
string can be obtained with the Length function. Besides, it may pass fixed
lengths to the Copy function, and so on.
When this software works with widestrings, in simple situations the
widestring will be autoconverted to an ansistring. In more complex situations
the length of the widestring will be calculated as the number of widechars it
contains, which is right too. The current UTF8 string, however, is a "type
string", and its length is currently the number of bytes...
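
To make the difference concrete, here is a small sketch. It is written in
Python rather than Pascal, purely as an illustration: Python bytes stand in for
a byte-indexed Pascal string, the sample word and the cp1251 code page are my
own assumptions, and slicing only approximates what Copy does. It is not a
model of the actual Lazarus/FPC implementation.

```python
# Illustration only: Python bytes stand in for a byte-indexed Pascal string.
word = "слово"  # sample five-letter Russian word (chosen for the demo)

one_byte = word.encode("cp1251")   # 1-byte encoding: one byte per letter
wide = word.encode("utf-16-le")    # widestring-like: one 2-byte unit per letter
utf8 = word.encode("utf-8")        # UTF-8: two bytes per Cyrillic letter

letters = len(word)                # what the old program expects
ansi_length = len(one_byte)        # bytes == letters
wide_length = len(wide) // 2       # 2-byte units == letters
utf8_length = len(utf8)            # bytes, NOT letters

# Rough equivalent of Copy(s, 1, 3) on a byte-indexed string.
ansi_prefix = one_byte[:3].decode("cp1251")
# The same fixed length cuts the second UTF-8 character in half;
# errors="replace" shows the damaged tail as U+FFFD.
utf8_prefix = utf8[:3].decode("utf-8", errors="replace")

print(letters, ansi_length, wide_length, utf8_length)
print(ansi_prefix, utf8_prefix)
```

With the 1-byte encoding and with 2-byte units the count matches the number of
letters, so the old assumptions keep working. With UTF8 the count doubles for
this word, and a fixed-length copy of 3 returns one letter plus a broken
character instead of three letters.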