[
https://issues.apache.org/jira/browse/AVRO-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849746#comment-13849746
]
Tie Liu commented on AVRO-1348:
-------------------------------
I just ran the Perf test in our environment, which is a 64-bit Linux box. Our
prod environment uses a commercial JVM called Azul. I ran the test on both Java
1.6.0_25, which is our dev version, and the Azul JVM. Below is the comparison.
With java 1.6.0_25:
$ java -version
java version "1.6.0_25"
Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode)
Using CharSet:
test name      time       M entries/sec  M bytes/sec  bytes/cycle
StringRead:    6097 ms    6.560          233.663      1780910
StringWrite:   7410 ms    5.398          192.269      1780910
Using "UTF-8" string literal:
test name      time       M entries/sec  M bytes/sec  bytes/cycle
StringRead:    5504 ms    7.267          258.839      1780910
StringWrite:   7307 ms    5.474          194.980      1780910
Running with Azul:
$ /efs/dist/java/azuljdk/5.5.3.0/common/bin/java -version
java version "1.6.0_33"
Java(TM) SE Runtime Environment (build 1.6.0_33-b5)
Java HotSpot(TM) 64-Bit Tiered VM (build 1.6.0_33-ZVM_5.5.3.0-b5-product-azlinuxM-X86_64, mixed mode)
With CharSet:
test name      time       M entries/sec  M bytes/sec  bytes/cycle
StringRead:    8878 ms    4.505          160.469      1780910
StringWrite:   13078 ms   3.058          108.936      1780910
With "UTF-8" string literal:
test name      time       M entries/sec  M bytes/sec  bytes/cycle
StringRead:    6976 ms    5.733          204.213      1780910
StringWrite:   12829 ms   3.118          111.053      1780910
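To make the comparison easier to read, the relative difference can be derived from the timings above by dividing the Charset time by the string-literal time for the same test. The sketch below does only that arithmetic; the class and method names (PerfRatioSketch, speedup) are hypothetical and are not part of the Avro Perf tool.

```java
// Hypothetical helper for reading the numbers above; not part of Avro's Perf test.
final class PerfRatioSketch {

    // Ratio of Charset time to string-literal time for one test.
    // A value above 1.0 means the string-literal run took less time.
    static double speedup(long charsetMillis, long literalMillis) {
        return (double) charsetMillis / literalMillis;
    }

    public static void main(String[] args) {
        // Timings (ms) copied from the tables above.
        System.out.printf("HotSpot StringRead:  %.3f%n", speedup(6097, 5504));
        System.out.printf("HotSpot StringWrite: %.3f%n", speedup(7410, 7307));
        System.out.printf("Azul    StringRead:  %.3f%n", speedup(8878, 6976));
        System.out.printf("Azul    StringWrite: %.3f%n", speedup(13078, 12829));
    }
}
```

On these single runs the read path shows the larger difference on both JVMs, while the write path changes only slightly; run-to-run variance is not captured here.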
Our application is a trading application that handles 30k-40k messages/sec at
peak time, so we are very careful about garbage collection. We call
Utf8.toString multiple times on each incoming/outgoing message, so getting rid
of the additional garbage created by the toString method is very important for
us. That is the biggest motivation for us to use the string literal instead of
Charset in this case.
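For readers following along, the two variants being compared correspond to two java.lang.String constructors: one takes a Charset object, the other takes the charset name as a string. The sketch below is a minimal illustration of that difference only. The class and method names (Utf8DecodeSketch, decodeWithCharset, decodeWithName) are hypothetical and are not taken from Avro's Utf8 class or from the attached patches.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical illustration; not the actual Avro Utf8 implementation or patch.
final class Utf8DecodeSketch {

    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Variant 1: decode through a Charset object.
    static String decodeWithCharset(byte[] bytes, int length) {
        return new String(bytes, 0, length, UTF8);
    }

    // Variant 2: decode through the charset name (the "UTF-8" string literal).
    static String decodeWithName(byte[] bytes, int length) {
        try {
            return new String(bytes, 0, length, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform,
            // so this branch is not expected to be reached.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "h\u00e9llo avro".getBytes(UTF8);
        String a = decodeWithCharset(bytes, bytes.length);
        String b = decodeWithName(bytes, bytes.length);
        // Both variants should produce the same String; only the decoding path differs.
        System.out.println(a.equals(b));
    }
}
```

Both methods return equal strings for the same input, so the choice between them comes down to speed and allocation behaviour on a given JVM, which is what the Perf numbers above measure.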
> Improve Utf8 to String conversion
> ---------------------------------
>
> Key: AVRO-1348
> URL: https://issues.apache.org/jira/browse/AVRO-1348
> Project: Avro
> Issue Type: Bug
> Reporter: Mark Wagner
> Assignee: Mohammad Kamrul Islam
> Attachments: AVRO-1348v2.patch, AVRO1348v1.patch
>
>
> AVRO-1241 found that the existing method of creating Strings from Utf8 byte
> arrays could be made faster. The same method is being used in the
> Utf8.toString(), and could likely be sped up by doing the same thing.