On Fri, 26 Sep 2008, Graeme Geldenhuys wrote:

> On Thu, Sep 25, 2008 at 10:33 PM, Florian Klaempfl
> <[EMAIL PROTECTED]> wrote:

>> Who says that? UTF-16 was simply chosen because it has features
>> (basically, supporting all characters) that ANSI doesn't.

> Sorry, my message was unclear and I got somewhat mixed up between ANSI
> and UTF-8. I meant the encoding of String or UnicodeString being
> UTF-16 instead of UTF-8. The CodeGear newsgroups are full of people
> saying that UTF-16 was chosen so that they could call the 'W' APIs
> without needing a conversion.

> My question is: has anybody actually measured the speed difference
> (actual timing results) between UTF-16 strings calling the 'W' APIs
> directly and converting UTF-8 to UTF-16 first and then calling the
> 'W' APIs? With today's computers, I can't imagine that there would be
> a "significant speed loss" from such conversions. The difference might
> be milliseconds, but that's not really a "significant speed loss", is it?

I think the main speed issue with UTF-8 is the speed of procedures like "val". A "val" that accepts both Western and Arabic-Indic digits would be significantly more complex, and therefore slower, in UTF-8 than in UTF-16.
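To illustrate the point, here is a sketch in C (hypothetical helper names, not FPC's actual "val" implementation): in UTF-16 every digit in both ranges is a single code unit, so a range check suffices, while in UTF-8 the Arabic-Indic digits U+0660..U+0669 are the two-byte sequences 0xD9 0xA0..0xA9, so the scanner has to decode variable-length sequences and track how many bytes it consumed.

```c
#include <stdint.h>

/* UTF-16: Western (U+0030..U+0039) and Arabic-Indic (U+0660..U+0669)
   digits are each a single code unit, so two range checks suffice. */
int digit_value_utf16(uint16_t cu)
{
    if (cu >= 0x0030 && cu <= 0x0039) return cu - 0x0030;
    if (cu >= 0x0660 && cu <= 0x0669) return cu - 0x0660;
    return -1;  /* not a digit */
}

/* UTF-8: Western digits are one byte, but Arabic-Indic digits are the
   two-byte sequence 0xD9 0xA0..0xA9, so the scanner must look at a
   variable number of bytes and report how many it consumed. */
int digit_value_utf8(const unsigned char *p, int *consumed)
{
    if (p[0] >= 0x30 && p[0] <= 0x39) {
        *consumed = 1;
        return p[0] - 0x30;
    }
    if (p[0] == 0xD9 && p[1] >= 0xA0 && p[1] <= 0xA9) {
        *consumed = 2;
        return p[1] - 0xA0;
    }
    *consumed = 1;  /* skip one byte on a non-digit */
    return -1;
}
```

For example, the string "٤2" (Arabic-Indic four, U+0664, followed by ASCII '2') is the byte sequence 0xD9 0xA4 0x32 in UTF-8; the UTF-8 scanner must consume two bytes for the first digit and one for the second, while the UTF-16 version handles each digit with a single code-unit comparison.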

> I suppose it would also be worthwhile to collect timing results for
> saving text files. After all, 99% of the time text files are stored in
> UTF-8, so in D2009 you would first have to convert UTF-16 to UTF-8 and
> then save, and do the opposite when reading, plus check for the byte
> order mark. If you used UTF-8 as the String encoding, no conversions
> would be required and no byte order mark checks would be needed.
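The byte order mark check mentioned above amounts to peeking at the first bytes of the file. A rough sketch in C (hypothetical names, not Delphi's or FPC's actual reader code):

```c
#include <stddef.h>

/* Possible results of sniffing the start of a text file. */
typedef enum { ENC_UNKNOWN, ENC_UTF8, ENC_UTF16_LE, ENC_UTF16_BE } Encoding;

/* Inspect the leading bytes of a buffer for a byte order mark and
   report the encoding it implies plus how many bytes to skip. */
Encoding detect_bom(const unsigned char *buf, size_t len, size_t *bom_len)
{
    *bom_len = 0;
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        *bom_len = 3;
        return ENC_UTF8;
    }
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        *bom_len = 2;
        return ENC_UTF16_LE;
    }
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        *bom_len = 2;
        return ENC_UTF16_BE;
    }
    return ENC_UNKNOWN;  /* no BOM: often plain UTF-8 or ANSI */
}
```

A UTF-16 String type would need this sniff plus a conversion on every read of a UTF-8 file; a UTF-8 String type would at most skip the three BOM bytes.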

For me the speed of input/output is less relevant; it is limited by disk speed anyway. It is the speed of processing that should be decisive.

Daniël
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
