On Fri, 26 Sep 2008, Graeme Geldenhuys wrote:
> On Thu, Sep 25, 2008 at 10:33 PM, Florian Klaempfl
> <[EMAIL PROTECTED]> wrote:
>> Who says that? UTF-16 was simply chosen because it has features
>> (supporting all characters, basically) that ANSI doesn't.
> Sorry, my message was unclear and I got somewhat mixed up between ANSI
> and UTF-8. I meant the encoding of String or UnicodeString being
> UTF-16 instead of UTF-8. The CodeGear newsgroups are full of people
> saying that UTF-16 was chosen so that the 'W' APIs could be called
> without needing a conversion.
>
> My question is: has anybody actually seen the speed difference (actual
> timing results) between a UTF-16 string calling the 'W' APIs directly
> and converting UTF-8 to UTF-16 first and then calling the 'W' APIs?
> With today's computers, I can't imagine that such conversions would
> cause a "significant speed loss". The difference might be
> milliseconds, but that's not really "significant", is it?
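To make the timing question concrete, here is a minimal micro-benchmark sketch. It is in Python rather than Pascal purely for brevity; the payload string, repetition count, and the helper name `convert_to_utf16` are invented for illustration, and the 'W' API call itself is left out — only the UTF-8 to UTF-16 conversion step is timed.

```python
# Illustrative micro-benchmark: cost of converting a UTF-8 byte string
# to UTF-16 LE (the form a Windows 'W' API would expect) before the call.
import timeit

utf8_data = ("Hello, world! " * 100).encode("utf-8")  # invented sample payload

def convert_to_utf16(data: bytes) -> bytes:
    # Decode UTF-8, then re-encode as UTF-16 LE.
    return data.decode("utf-8").encode("utf-16-le")

n = 10_000
seconds = timeit.timeit(lambda: convert_to_utf16(utf8_data), number=n)
print(f"{n} conversions of {len(utf8_data)} bytes: {seconds * 1000:.1f} ms total")
```

On any recent machine the total for ten thousand conversions of a short string is small, which is the kind of number the question above asks to see measured rather than assumed.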
I think the main speed issue with UTF-8 is the speed of procedures like
"val". A "val" that accepts both Western and Arabic digits would be
significantly more complex, and therefore slower, in UTF-8 than in UTF-16.
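To illustrate the point, here is a sketch of a "val"-style digit scanner over both encodings (Python for illustration; the helper names are invented). The Arabic-Indic digits U+0660..U+0669 are two-byte sequences in UTF-8 that must be decoded before the digit value can be extracted, while in UTF-16 every digit, ASCII or Arabic-Indic, is a single code unit that can be range-checked directly.

```python
# Why a digit scanner is messier over UTF-8 than over UTF-16.
ASCII_ZERO = 0x30    # '0'
ARABIC_ZERO = 0x0660  # U+0660 ARABIC-INDIC DIGIT ZERO

def val_utf8(data: bytes) -> int:
    """val-like parser over raw UTF-8 bytes: must decode 2-byte sequences."""
    result, i = 0, 0
    while i < len(data):
        b = data[i]
        if ASCII_ZERO <= b <= ASCII_ZERO + 9:          # ASCII digit, 1 byte
            digit, i = b - ASCII_ZERO, i + 1
        elif (b & 0xE0) == 0xC0 and i + 1 < len(data):  # 2-byte UTF-8 sequence
            cp = ((b & 0x1F) << 6) | (data[i + 1] & 0x3F)
            if not ARABIC_ZERO <= cp <= ARABIC_ZERO + 9:
                raise ValueError("not a digit")
            digit, i = cp - ARABIC_ZERO, i + 2
        else:
            raise ValueError("not a digit")
        result = result * 10 + digit
    return result

def val_utf16(units: list[int]) -> int:
    """Same parser over UTF-16 code units: one unit per digit, no decoding."""
    result = 0
    for u in units:
        if ASCII_ZERO <= u <= ASCII_ZERO + 9:
            result = result * 10 + (u - ASCII_ZERO)
        elif ARABIC_ZERO <= u <= ARABIC_ZERO + 9:
            result = result * 10 + (u - ARABIC_ZERO)
        else:
            raise ValueError("not a digit")
    return result

print(val_utf8("٤٢".encode("utf-8")))       # Arabic-Indic digits → 42
print(val_utf16([ord(c) for c in "42"]))    # → 42
```

The UTF-8 version has to branch on byte patterns and reassemble code points; the UTF-16 version is two range checks per code unit, which is the complexity gap the paragraph above alludes to.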
> I suppose it would also be worthwhile to do timing tests for saving
> text files. After all, 99% of the time text files are stored in
> UTF-8, so in D2009 you would first have to convert UTF-16 to UTF-8
> and then save, and do the opposite when reading, plus check for the
> byte order mark. If you used UTF-8 as the String encoding, no
> conversions would be required and no byte order mark checks would be
> needed.
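The save/load round trip described above can be sketched as follows (Python for illustration; the file name and helper names are invented). The encode on save stands in for the UTF-16 to UTF-8 conversion a UTF-16-based String type would need, and the byte order mark check on load is the extra step performed when reading.

```python
# Round trip: in-memory string saved as UTF-8, read back with a BOM check.
import codecs, os, tempfile

def save_utf8(text: str, path: str) -> None:
    # With a UTF-16-based String type, this encode is the extra
    # UTF-16 -> UTF-8 conversion mentioned above.
    with open(path, "wb") as f:
        f.write(text.encode("utf-8"))

def load_utf8(path: str) -> str:
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):   # strip a leading BOM, if present
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8")

path = os.path.join(tempfile.mkdtemp(), "sample.txt")
save_utf8("Daniël", path)
print(load_utf8(path))  # → Daniël
```

Whether this per-file conversion cost matters in practice is exactly what the quoted paragraph proposes to measure.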
For me the speed of input/output is less relevant; it is limited by disk
speed anyway. It's the speed of in-memory processing that should be decisive.
Daniël
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel