Hi Oliver,

I've created a small benchmark too. It takes Book One of Leo Tolstoy's "War and Peace" as input and converts it from Russian CP-1251 to UTF-16 (10 times) and back (also 10 times). You can find the benchmark's source code and a build file at [1].

The first difference from your benchmark is the language and encoding: Russian text in CP-1251 in my case. The second difference is the set of tested VMs: I've run the benchmark on RI, J9 and DRLVM.
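
For readers who just want the shape of such a test, here is a minimal sketch of a decode/encode round-trip timing loop. It is not the benchmark published at [1]; the class name CharsetRoundTrip, the helper methods, and the built-in fallback sample are all assumptions made for illustration. It relies only on standard java.nio.charset calls, and since Java chars are UTF-16 code units, decoding into a CharBuffer stands in for the "CP-1251 to UTF-16" direction.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch, not the benchmark from [1].
class CharsetRoundTrip {
    // Each direction is repeated 10 times, as in the description above.
    static final int ROUNDS = 10;

    /** Decodes the same bytes `rounds` times and returns the last result. */
    static CharBuffer decodeRounds(CharsetDecoder decoder, byte[] input, int rounds)
            throws CharacterCodingException {
        CharBuffer chars = null;
        for (int i = 0; i < rounds; i++) {
            // decode(ByteBuffer) resets the decoder first, so it can be reused.
            chars = decoder.decode(ByteBuffer.wrap(input));
        }
        return chars;
    }

    /** Encodes the same chars `rounds` times and returns the last result. */
    static ByteBuffer encodeRounds(CharsetEncoder encoder, CharBuffer chars, int rounds)
            throws CharacterCodingException {
        ByteBuffer bytes = null;
        for (int i = 0; i < rounds; i++) {
            // duplicate() shares the content but has its own position,
            // so the source buffer is not consumed by encoding.
            bytes = encoder.encode(chars.duplicate());
        }
        return bytes;
    }

    public static void main(String[] args) throws Exception {
        Charset cp1251 = Charset.forName("windows-1251");
        // Assumption: the input file (if given) is CP-1251 text; otherwise a
        // tiny built-in sample ("War and Peace" in Russian) is used.
        byte[] input = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : "\u0412\u043e\u0439\u043d\u0430 \u0438 \u043c\u0438\u0440".getBytes(cp1251);

        CharsetDecoder decoder = cp1251.newDecoder();
        CharsetEncoder encoder = cp1251.newEncoder();

        long t0 = System.nanoTime();
        CharBuffer chars = decodeRounds(decoder, input, ROUNDS);
        long t1 = System.nanoTime();
        encodeRounds(encoder, chars, ROUNDS);
        long t2 = System.nanoTime();

        // Output format imitates the result lines in this email.
        System.out.println("<" + decoder.getClass().getName() + "> Decoding time: "
                + (t1 - t0) / 1_000_000 + " millis");
        System.out.println("<" + encoder.getClass().getName() + "> Encoding time: "
                + (t2 - t1) / 1_000_000 + " millis");
    }
}
```

Compile with javac and run with `java CharsetRoundTrip <file>`, where the file is CP-1251 text. The sketch does no warm-up, so on a JIT-compiling VM the numbers include compilation time; treat them as rough.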
The results are below. BTW, they show that in this particular test our internal providers (from the org.apache.harmony.niochar.charset package) are faster overall than both versions of ICU, although ICU4JNI is somewhat faster at encoding alone. Another interesting fact is the terrible ICU performance on DRLVM, especially for decoding (more than 6 seconds), whereas on J9 ICU works rather fast. IMO the bad performance on DRLVM is something that should be fixed.

And finally, yes, ICU4JNI is a little bit faster than ICU4J in this test. However, "War and Peace" is a rather big book (the paper version of the first part contains about 400 pages, so 10 repetitions amount to about 4000 pages), and even so the difference in the numbers is not so big.

[1] http://people.apache.org/~ayza/icu_experiments/

RI
---
Built-in
<sun.nio.cs.MS1251$Decoder> Decoding time: 571 millis
<sun.nio.cs.MS1251$Encoder> Encoding time: 351 millis

ICU4j
<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 430 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 551 millis

ICU4JNI
<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 401 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 540 millis

J9
---
Built-in
<org.apache.harmony.niochar.charset.CP_1251$Decoder> Decoding time: 231 millis
<org.apache.harmony.niochar.charset.CP_1251$Encoder> Encoding time: 430 millis

ICU4j
<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 781 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 620 millis

ICU4JNI
<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 561 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 371 millis

DRLVM
---
Built-in
<org.apache.harmony.niochar.charset.CP_1251$Decoder> Decoding time: 351 millis
<org.apache.harmony.niochar.charset.CP_1251$Encoder> Encoding time: 540 millis

ICU4j
<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 6660 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 1071 millis

ICU4JNI
<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 6179 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 451 millis

With Best Regards,
Alexei

2007/10/11, Oliver Deakin <[EMAIL PROTECTED]>:
> Tony Wu wrote:
> > On 10/8/07, Oliver Deakin <[EMAIL PROTECTED]> wrote:
> >> Are there any particular
> >> benchmarks you had in mind for this?
> >>
> > ya, there is a micro benchmark on HARMONY-3709
> >
> <SNIP!>
>
> I have run the micro benchmark on Harmony with its current ICU
> configuration (icu4jni 3.4.4) and on Harmony with pure icu4j 3.8. The
> results are pretty much as expected - for small jobs icu4j is
> significantly faster, for large jobs icu4jni comes out on top (full
> results at the end of this email). It seems that performance-wise there
> are benefits on both sides depending on the work we are doing.
>
> Regards,
> Oliver
