Re: [lazarus] UTF-8 vs UTF-16 support

Mattias Gärtner Mon, 08 Oct 2007 09:55:40 -0700

Quoting Luca Olivetti <[EMAIL PROTECTED]>:

> Mattias Gärtner wrote:
>
> > For most string operations, like computing the byte length or comparing
> > strings ASCII case-insensitively, UTF-8 is 100% compatible.
>
> but not if you need the length in characters, say to limit a text to 40
> characters and mark the truncation with '..':
>
>
> if length(s)>40 then s:=copy(s,1,38)+'..';
>
> or maybe faster
>
> if length(s)>40 then
> begin
>    s[39]:='.';
>    s[40]:='.';
>    setlength(s,40);
> end;
>
> would break with UTF-8 (and with UTF-16 too if you use characters
> outside the BMP). There are probably UTF-8 equivalents of the above, but
> old habits die hard...


if UTF8Length(s)>40 then s:=UTF8Copy(s,1,38)+'..';
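
To make clear what such a call has to do, here is a minimal sketch that does the same truncation by hand, without UTF8Length or UTF8Copy. The names CodePointCount, BytePrefixLength and Truncate are made up for this illustration and are not LCL functions. The sketch assumes s holds valid UTF-8 and counts code points, so a base letter followed by a combining mark still counts as two.

```pascal
program Utf8TruncateSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: counts code points by skipping UTF-8
// continuation bytes (bit pattern 10xxxxxx).
function CodePointCount(const s: string): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(s) do
    if (Ord(s[i]) and $C0) <> $80 then
      Inc(Result);
end;

// Hypothetical helper: byte length of the first MaxCodePoints code points.
function BytePrefixLength(const s: string; MaxCodePoints: Integer): Integer;
var
  i, Seen: Integer;
begin
  Seen := 0;
  for i := 1 to Length(s) do
    if (Ord(s[i]) and $C0) <> $80 then
    begin
      // s[i] starts a new code point; stop before it once enough were seen.
      if Seen = MaxCodePoints then
        Exit(i - 1);
      Inc(Seen);
    end;
  Result := Length(s);
end;

// Hypothetical helper: limits s to MaxLen code points, ending in '..'.
function Truncate(const s: string; MaxLen: Integer): string;
begin
  if CodePointCount(s) > MaxLen then
    Result := Copy(s, 1, BytePrefixLength(s, MaxLen - 2)) + '..'
  else
    Result := s;
end;

var
  s: string;
begin
  s := 'G'#$C3#$A4'rtner';           // UTF-8 bytes of "Gärtner"
  WriteLn(Length(s));                // byte length
  WriteLn(CodePointCount(s));        // code point length
  WriteLn(Length(Truncate(s, 5)));   // byte length after truncation
end.
```

The byte-indexed variant from the quoted mail (s[39]:='.') can cut a multi-byte sequence in half, which is why the cut position has to be found by scanning for lead bytes first.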


> Maybe for internal processing utf-32 is better and only use utf-8 for
> input/output and/or interface with other systems?

:)

Speed: depends on what you do (UTF-8, UTF-16 or UTF-32).
Memory: UTF-8 or UTF-16.
Compatibility: UTF-8 (VCL)
Easy coding: UTF-32

There is no absolute winner.
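
As a rough illustration of the memory point, this small sketch prints how many bytes one code point needs in each encoding. Utf8Bytes and Utf16Bytes are hypothetical helper names, and the sample code points are chosen only for illustration.

```pascal
program EncodedSizeSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: bytes needed to encode code point cp in UTF-8.
function Utf8Bytes(cp: Cardinal): Integer;
begin
  if cp < $80 then Result := 1
  else if cp < $800 then Result := 2
  else if cp < $10000 then Result := 3
  else Result := 4;
end;

// Hypothetical helper: bytes needed in UTF-16
// (one 16-bit unit inside the BMP, a surrogate pair outside it).
function Utf16Bytes(cp: Cardinal): Integer;
begin
  if cp < $10000 then Result := 2
  else Result := 4;
end;

const
  // 'A', 'ä', the euro sign, and one code point outside the BMP
  Samples: array[0..3] of Cardinal = ($41, $E4, $20AC, $1F600);
var
  i: Integer;
begin
  for i := Low(Samples) to High(Samples) do
    WriteLn('UTF-8: ', Utf8Bytes(Samples[i]),
            '  UTF-16: ', Utf16Bytes(Samples[i]),
            '  UTF-32: 4');
end.
```

UTF-8 is smallest for mostly ASCII text, UTF-16 is smaller for text in the range U+0800..U+FFFF, and UTF-32 always takes four bytes per code point.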

Mattias

