Mattias Gärtner wrote:

For most string operations, like computing the byte length or comparing strings
ASCII case insensitive, UTF-8 is 100% compatible.

But not if you need the length in characters, say to limit a text to 40 characters and indicate that it has been truncated with '..':


if length(s)>40 then s:=copy(s,1,38)+'..';

or, maybe faster:

if length(s)>40 then
begin
  s[39]:='.';
  s[40]:='.';
  setlength(s,40);
end;

Both would break with UTF-8, because length(s) counts bytes rather than characters and positions 38 to 40 may fall in the middle of a multi-byte sequence (and with UTF-16 too, if you use characters outside the BMP). There are probably UTF-8 equivalents of the above, but old habits die hard... Maybe UTF-32 is better for internal processing, with UTF-8 used only for input/output and/or interfacing with other systems?
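
For what it's worth, here is a rough sketch of what a UTF-8 equivalent of the first snippet might look like. The names Utf8CodePointCount, Utf8Left and TruncateWithDots are made up for this example, not existing RTL routines, and the code assumes that s holds valid UTF-8. It counts code points, so a base letter followed by a combining accent still counts as two.

```pascal
program Utf8Truncate;
{$mode objfpc}{$H+}

{ Hypothetical helper: number of code points in a UTF-8 string.
  Every byte that is not a continuation byte (10xxxxxx) starts one. }
function Utf8CodePointCount(const s: string): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(s) do
    if (Ord(s[i]) and $C0) <> $80 then
      Inc(Result);
end;

{ Hypothetical helper: the first MaxChars code points of s.
  Assumes s is valid UTF-8; the cut always lands on a lead byte. }
function Utf8Left(const s: string; MaxChars: Integer): string;
var
  i, Count: Integer;
begin
  Count := 0;
  for i := 1 to Length(s) do
    if (Ord(s[i]) and $C0) <> $80 then
    begin
      // s[i] starts a new code point
      if Count = MaxChars then
        Exit(Copy(s, 1, i - 1));
      Inc(Count);
    end;
  Result := s;
end;

{ UTF-8 aware version of:
  if length(s)>40 then s:=copy(s,1,38)+'..'; }
function TruncateWithDots(const s: string; Limit: Integer): string;
begin
  if Utf8CodePointCount(s) > Limit then
    Result := Utf8Left(s, Limit - 2) + '..'
  else
    Result := s;
end;

var
  s, t: string;
  i: Integer;
begin
  // Build 50 copies of U+00E9 (two bytes each in UTF-8).
  s := '';
  for i := 1 to 50 do
    s := s + #$C3#$A9;
  t := TruncateWithDots(s, 40);
  WriteLn('code points: ', Utf8CodePointCount(t), ', bytes: ', Length(t));
end.
```

The in-place variant (s[39]:='.'; s[40]:='.'; setlength(s,40)) has no such direct translation, since the byte offset of the 39th character is not known without scanning the string first. I believe the Lazarus LazUTF8 unit has UTF8Length and UTF8Copy for this sort of thing, which would be the better choice over hand-rolled helpers.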

Bye
--
Luca Olivetti
Wetron Automatización S.A. http://www.wetron.es/
Tel. +34 93 5883004      Fax +34 93 5883007
