Re: [fpc-devel] Unicode RTL

Tomas Hajny Wed, 16 Nov 2005 08:14:33 -0800

Marco van de Voort wrote:
>> >> >
>> >> > ... has a different implementation for utf-8 and 8-bit code pages.
>> >>
>> >> Why? With utf-8 a string is searched, with 8-bit cp one char. No other
>> >> char/sequence of char other than ? can generate the byte sequence
>> >> representing ?
>> >
>> > const s = 'Daniël';
>> >
>> > var accent : utf8char;
>> >
>> > x:=pos('i','Daniël');
>> > accent:=s[x+1];
>>
>> We could have special support for assignment to type utf8char, couldn't
>> we?
>
> It would be horribly slow, since this would apply to length too, and think of
> while i<length(x) do  inc(i);  like constructs.
>
> I think the avg delphi code simply assumes 100% that chars are fixed width.


I'm afraid you won't get very far with that assumption. "Existing
Delphi code" most probably isn't DBCS/MBCS safe.

Regarding constructs like "while i<length(x) do" - I'd say that the most
common uses of these are comparison, copying, conversion to
uppercase/lowercase, and combinations of these. All of these operations
should be performed using dedicated (RTL) functions; otherwise they will
fail in a DBCS/MBCS environment anyway (or at least result in a suboptimal
implementation).
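
To make the indexing problem concrete, here is a minimal sketch that reuses
the quoted 'Daniël' example. It assumes a current Free Pascal compiler in
which UTF8Decode from the System unit returns a UnicodeString; the program
name and variable names are only illustrative. The string is written as
explicit UTF-8 bytes so the result does not depend on the encoding of the
source file.

```pascal
program Utf8Index;
{$mode objfpc}{$H+}

var
  S: AnsiString;      // holds the raw UTF-8 bytes
  U: UnicodeString;   // decoded copy, one element per UTF-16 code unit
  X: Integer;
begin
  // 'Daniël' as UTF-8 bytes: e-diaeresis is the two bytes $C3 $AB.
  S := 'Dani'#$C3#$AB'l';

  X := Pos('i', S);
  WriteLn('Length(S)   = ', Length(S));      // 7 bytes for 6 characters
  WriteLn('Ord(S[X+1]) = ', Ord(S[X + 1]));  // 195 ($C3): only the lead byte

  // Assumption: UTF8Decode (System unit) is available in this RTL.
  U := UTF8Decode(S);
  WriteLn('Length(U)   = ', Length(U));      // 6
  // X is still valid here only because everything before it is ASCII.
  WriteLn('Ord(U[X+1]) = ', Ord(U[X + 1]));  // 235 ($EB): the whole character
end.
```

Byte-wise code that walks S up to Length(S) sees seven elements and splits
the accented character in two, which is why such loops are better replaced
by a routine that knows the encoding.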

Tomas

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
