Marco van de Voort asked me to reuse the extensive functionality of the charset
unit (see the bugtracker comment), so I have extended my test application. Here
are the results:
...
ISO-8859-1 >> UTF-8 using LConvEncoding ¦ Input string has 256 characters.
---------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,iso88591,utf8):string
100000 times, Time: 0,312 [s] : Result is correct.
Evaluating LConvEncoding.ISO_8859_1ToUTF8(string):string
100000 times, Time: 0,249 [s] : Result is correct.
ISO-8859-1 >> UTF-8 using Charset ¦ Input string has 256 characters.
---------------------------------------------------------------------
Charset does not support conversions to UTF8, using utf8-unit for that
Evaluating utf8.UnicodeToUTF8(Charset.getunicode(string,iso88591)):string
100000 times, Time: 2,480 [s] : Result is correct.
ISO-8859-1 >> UTF-8 using Codepages ¦ Input string has 256 characters.
-----------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-8):string
100000 times, Time: 0,187 [s] : Result is correct.
Evaluating ConvertToUTF8(string,chEncISO-8859-1):string
100000 times, Time: 0,234 [s] : Result is correct.
...
ISO-8859-1 >> UTF-16 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF16, using utf16-unit for that
Evaluating utf8.UnicodeToUTF16(Charset.getunicode(string,iso88591)):widestring
100000 times, Time: 7,847 [s] : Result is correct.
ISO-8859-1 >> UTF-16 using Codepages ¦ Input string has 256 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-16):widestring
100000 times, Time: 0,203 [s] : Result is correct.
ISO-8859-2 >> UTF-16 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF16, using utf16-unit for that
Evaluating utf8.UnicodeToUTF16(Charset.getunicode(string,iso88592)):widestring
100000 times, Time: 7,831 [s] : Result is correct.
ISO-8859-2 >> UTF-16 using Codepages ¦ Input string has 256 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-2,chEncUTF-16):widestring
100000 times, Time: 0,219 [s] : Result is correct.
....
ISO-8859-1 >> ISO-8859-2 using LConvEncoding ¦ Input string has 256 characters.
--------------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,iso88591,iso88592):string
100000 times, Time: 0,873 [s]
ISO-8859-1 >> ISO-8859-2 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------------
Evaluating Charset.getascii(Charset.getunicode(string,iso88591),iso88592):string
100000 times, Time: 9,079 [s]
ISO-8859-1 >> ISO-8859-2 using Codepages ¦ Input string has 256 characters.
----------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncISO-8859-2):string
100000 times, Time: 0,218 [s]
....
SHIFT_JIS >> UTF-8 using LConvEncoding ¦ Input string has 14843 characters.
----------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,cp932,utf8):string
1000 times, Time: 24,321 [s]
Length(Result)=22078 Length(Reference)=22173 : 79 characters are different.
Evaluating LConvEncoding.CP932ToUTF8(string):string
1000 times, Time: 24,414 [s]
Length(Result)=22078 Length(Reference)=22173 : 79 characters are different.
SHIFT_JIS >> UTF-8 using Charset ¦ Input string has 14843 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF8, using utf8-unit for that
Evaluating utf8.UnicodeToUTF8(Charset.getunicode(string,cp932)):string
1000 times, Time: 1,560 [s]
Length(Result)=39233 Length(Reference)=22173 : 21798 characters are different.
SHIFT_JIS >> UTF-8 using Codepages ¦ Input string has 14843 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncCP932,chEncUTF-8):string
1000 times, Time: 0,234 [s] : Result is correct.
Evaluating ConvertToUTF8(string,chEncCP932):string
1000 times, Time: 0,218 [s] : Result is correct.
Evaluating CP932ToUTF8(string):string
1000 times, Time: 0,218 [s] : Result is correct.
Hmmm, the SHIFT_JIS >> UTF-8 conversion using the Charset unit ended up as a
complete mess. The reason is that charset, for all its extensive functionality,
has no means of converting double-byte charsets to Unicode. :(
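
To illustrate the kind of failure seen above, here is a minimal sketch. It is
written in Python purely for illustration and is not the code of the Charset
unit or of the test program. It assumes that a converter without double-byte
support maps every input byte to a character on its own; the helper names
decode_per_byte and decode_double_byte are hypothetical, and Python's built-in
"cp932" codec stands in for a converter that understands lead/trail byte pairs.

```python
def decode_per_byte(data: bytes) -> str:
    """Map each byte to one code point on its own (single-byte table style).

    The identity mapping used here is only a stand-in for any 256-entry table.
    """
    return "".join(chr(b) for b in data)


def decode_double_byte(data: bytes) -> str:
    """Decode CP932 with lead/trail byte pairs taken into account."""
    return data.decode("cp932")


# Three Japanese characters, each stored as two bytes in CP932.
sample = "日本語".encode("cp932")

good = decode_double_byte(sample)
bad = decode_per_byte(sample)

# Character count and UTF-8 length of each result.
print("double-byte aware:", len(good), "chars,", len(good.encode("utf-8")), "UTF-8 bytes")
print("per-byte lookup:  ", len(bad), "chars,", len(bad.encode("utf-8")), "UTF-8 bytes")
```

Under this assumption the per-byte result contains twice as many characters as
the correct one and a longer UTF-8 encoding, which is the same pattern as the
oversized Length(Result) reported for the Charset run.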
The complete test results are in the attachment.
I will publish the test program on the bugtracker.
Greetings
OS Name                 Microsoft Windows 7 Home Premium
Version                 6.1.7600 Build 7600
System Manufacturer     Gigabyte Technology Co., Ltd.
System Model            GA-870A-UD3
System Type             x64-based PC
Processor               AMD Phenom(tm) II X6 1090T Processor, 3200 Mhz, 6 Core(s), 6 Logical Processor(s)
BIOS Version/Date       Award Software International, Inc. F1, 15.04.2010
Total Physical Memory   8,00 GB
Total Virtual Memory    14,0 GB
Testing character conversion.
System is little endian
Using half translation tables.
Using UTF8-translation tables.
ISO-8859-1 >> UTF-8 using LConvEncoding ¦ Input string has 256 characters.
---------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,iso88591,utf8):string
100000 times, Time: 0,312 [s] : Result is correct.
Evaluating LConvEncoding.ISO_8859_1ToUTF8(string):string
100000 times, Time: 0,249 [s] : Result is correct.
ISO-8859-1 >> UTF-8 using Charset ¦ Input string has 256 characters.
---------------------------------------------------------------------
Charset does not support conversions to UTF8, using utf8-unit for that
Evaluating utf8.UnicodeToUTF8(Charset.getunicode(string,iso88591)):string
100000 times, Time: 2,480 [s] : Result is correct.
ISO-8859-1 >> UTF-8 using Codepages ¦ Input string has 256 characters.
-----------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-8):string
100000 times, Time: 0,187 [s] : Result is correct.
Evaluating ConvertToUTF8(string,chEncISO-8859-1):string
100000 times, Time: 0,234 [s] : Result is correct.
Evaluating ISO_8859_1ToUTF8(string):string
100000 times, Time: 0,234 [s] : Result is correct.
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-8):widestring
100000 times, Time: 0,250 [s] : Result is correct.
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-16):widestring
100000 times, Time: 0,202 [s] : Result is correct.
ISO-8859-1 >> UTF-16 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF16, using utf16-unit for that
Evaluating utf8.UnicodeToUTF16(Charset.getunicode(string,iso88591)):widestring
100000 times, Time: 7,847 [s] : Result is correct.
ISO-8859-1 >> UTF-16 using Codepages ¦ Input string has 256 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-16):widestring
100000 times, Time: 0,203 [s] : Result is correct.
ISO-8859-2 >> UTF-16 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF16, using utf16-unit for that
Evaluating utf8.UnicodeToUTF16(Charset.getunicode(string,iso88592)):widestring
100000 times, Time: 7,831 [s] : Result is correct.
ISO-8859-2 >> UTF-16 using Codepages ¦ Input string has 256 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-2,chEncUTF-16):widestring
100000 times, Time: 0,219 [s] : Result is correct.
ISO-8859-2 >> UTF-8 using LConvEncoding ¦ Input string has 256 characters.
---------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,iso88592,utf8):string
100000 times, Time: 0,297 [s] : Result is correct.
Evaluating LConvEncoding.ISO_8859_2ToUTF8(string):string
100000 times, Time: 0,250 [s] : Result is correct.
ISO-8859-2 >> UTF-8 using Codepages ¦ Input string has 256 characters.
-----------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-2,chEncUTF-8):string
100000 times, Time: 0,203 [s] : Result is correct.
Evaluating ConvertToUTF8(string,chEncISO-8859-2):string
100000 times, Time: 0,234 [s] : Result is correct.
Evaluating ISO_8859_2ToUTF8(string):string
100000 times, Time: 0,234 [s] : Result is correct.
Evaluating DirectConversion(string,chEncISO-8859-2,chEncUTF-8):widestring
100000 times, Time: 0,281 [s] : Result is correct.
Evaluating DirectConversion(string,chEncISO-8859-2,chEncUTF-16):widestring
100000 times, Time: 0,203 [s] : Result is correct.
ISO-8859-1 >> ISO-8859-2 using LConvEncoding ¦ Input string has 256 characters.
--------------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,iso88591,iso88592):string
100000 times, Time: 0,873 [s]
ISO-8859-1 >> ISO-8859-2 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------------
Evaluating Charset.getascii(Charset.getunicode(string,iso88591),iso88592):string
100000 times, Time: 9,079 [s]
ISO-8859-1 >> ISO-8859-2 using Codepages ¦ Input string has 256 characters.
----------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncISO-8859-2):string
100000 times, Time: 0,218 [s]
UTF-16 >> UTF-16BE using Codepages ¦ Input string has 256 characters.
----------------------------------------------------------------------
Evaluating DirectConversion(widestring,chEncUTF-16,chEncUTF-16BE):widestring
100000 times, Time: 0,234 [s] : Result is correct.
UTF-16 >> UTF-8 using Codepages ¦ Input string has 256 characters.
-------------------------------------------------------------------
Evaluating DirectConversion(widestring,chEncUTF-16,chEncUTF-8):string
100000 times, Time: 0,203 [s] : Result is correct.
SHIFT_JIS >> UTF-8 using LConvEncoding ¦ Input string has 14843 characters.
----------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,cp932,utf8):string
1000 times, Time: 24,321 [s]
Length(Result)=22078 Length(Reference)=22173 : 79 characters are different.
Evaluating LConvEncoding.CP932ToUTF8(string):string
1000 times, Time: 24,414 [s]
Length(Result)=22078 Length(Reference)=22173 : 79 characters are different.
SHIFT_JIS >> UTF-8 using Charset ¦ Input string has 14843 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF8, using utf8-unit for that
Evaluating utf8.UnicodeToUTF8(Charset.getunicode(string,cp932)):string
1000 times, Time: 1,560 [s]
Length(Result)=39233 Length(Reference)=22173 : 21798 characters are different.
SHIFT_JIS >> UTF-8 using Codepages ¦ Input string has 14843 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncCP932,chEncUTF-8):string
1000 times, Time: 0,234 [s] : Result is correct.
Evaluating ConvertToUTF8(string,chEncCP932):string
1000 times, Time: 0,218 [s] : Result is correct.
Evaluating CP932ToUTF8(string):string
1000 times, Time: 0,218 [s] : Result is correct.
UTF-8 >> SHIFT_JIS using LConvEncoding ¦ Input string has 22173 characters.
----------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,utf8,cp932):string
1000 times, Time: 24,367 [s] : 370 characters are different.
Evaluating LConvEncoding.UTF8ToCp932(string):string
1000 times, Time: 24,399 [s] : 370 characters are different.
UTF-8 >> SHIFT_JIS using Codepages ¦ Input string has 22173 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncUTF-8,chEncCP932):string
1000 times, Time: 0,328 [s] : Result is correct.
Evaluating UTF8ToCp932(string):string
1000 times, Time: 0,344 [s] : Result is correct.