On Fri, 1 Aug 2025 12:34:15 GMT, Brett Okken <d...@openjdk.org> wrote:

> As suggested on the mailing list, when encoding Latin-1 bytes to UTF-8 we
> can count the leading positive (ASCII) bytes. If a negative byte is found,
> we can copy all the leading positive values to the target byte[] before
> processing the remaining data 1 byte at a time.
> 
> https://mail.openjdk.org/pipermail/core-libs-dev/2025-July/149417.html
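
For readers skimming the thread, the idea quoted above can be sketched roughly as follows. This is only an illustration, not the code in the PR: the class name, `countLeadingPositives`, and `encode` are hypothetical, and a plain loop stands in for whatever optimized counting routine the JDK uses internally.

```java
import java.util.Arrays;

final class Latin1ToUtf8Sketch {

    // Hypothetical helper: number of leading bytes with the high bit clear (ASCII).
    static int countLeadingPositives(byte[] src) {
        int i = 0;
        while (i < src.length && src[i] >= 0) {
            i++;
        }
        return i;
    }

    // Encodes Latin-1 bytes as UTF-8 (illustrative only).
    static byte[] encode(byte[] latin1) {
        int ascii = countLeadingPositives(latin1);
        if (ascii == latin1.length) {
            // Pure ASCII: the UTF-8 form is byte-for-byte identical.
            return Arrays.copyOf(latin1, latin1.length);
        }
        // Worst case: every byte after the ASCII prefix expands to two bytes.
        byte[] dst = new byte[ascii + 2 * (latin1.length - ascii)];
        // Bulk-copy the ASCII prefix in one step.
        System.arraycopy(latin1, 0, dst, 0, ascii);
        int dp = ascii;
        // Process the remaining data one byte at a time.
        for (int sp = ascii; sp < latin1.length; sp++) {
            byte b = latin1[sp];
            if (b >= 0) {
                dst[dp++] = b;
            } else {
                // U+0080..U+00FF encode as two bytes: 110000xx 10xxxxxx
                dst[dp++] = (byte) (0xC0 | ((b & 0xFF) >> 6));
                dst[dp++] = (byte) (0x80 | (b & 0x3F));
            }
        }
        return Arrays.copyOf(dst, dp);
    }
}
```

In this sketch the bulk copy only pays off when there is a reasonably long ASCII prefix before the first negative byte.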

Benchmark on win64

Baseline:


Benchmark                           (charsetName)  Mode  Cnt      Score     Error  Units
StringEncode.encodeAllMixed                 UTF-8  avgt   10  20067.519 ± 528.152  ns/op
StringEncode.encodeAsciiLong                UTF-8  avgt   10  12115.389 ± 307.491  ns/op
StringEncode.encodeAsciiShort               UTF-8  avgt   10     70.098 ±   1.696  ns/op
StringEncode.encodeLatin1LongEnd            UTF-8  avgt   10   1974.391 ± 162.405  ns/op
StringEncode.encodeLatin1LongOnly           UTF-8  avgt   10    270.097 ±  13.840  ns/op
StringEncode.encodeLatin1LongStart          UTF-8  avgt   10   1876.366 ±  51.971  ns/op
StringEncode.encodeLatin1Mixed              UTF-8  avgt   10   4973.070 ± 130.426  ns/op
StringEncode.encodeLatin1Short              UTF-8  avgt   10     96.227 ±   2.816  ns/op
StringEncode.encodeShortMixed               UTF-8  avgt   10    360.586 ±   8.691  ns/op
StringEncode.encodeUTF16LongEnd             UTF-8  avgt   10   1534.748 ±  34.584  ns/op
StringEncode.encodeUTF16LongOnly            UTF-8  avgt   10    528.919 ±  15.143  ns/op
StringEncode.encodeUTF16LongStart           UTF-8  avgt   10   2275.117 ±  50.152  ns/op
StringEncode.encodeUTF16Mixed               UTF-8  avgt   10   4398.943 ± 116.607  ns/op
StringEncode.encodeUTF16Short               UTF-8  avgt   10    152.219 ±   8.677  ns/op



Patch:

Benchmark                           (charsetName)  Mode  Cnt      Score     Error  Units
StringEncode.encodeAllMixed                 UTF-8  avgt   10  18876.056 ± 330.644  ns/op
StringEncode.encodeAsciiLong                UTF-8  avgt   10  12040.590 ± 165.905  ns/op
StringEncode.encodeAsciiShort               UTF-8  avgt   10     69.895 ±   0.318  ns/op
StringEncode.encodeLatin1LongEnd            UTF-8  avgt   10    574.455 ±  14.769  ns/op
StringEncode.encodeLatin1LongOnly           UTF-8  avgt   10    284.553 ±   1.886  ns/op
StringEncode.encodeLatin1LongStart          UTF-8  avgt   10   2230.789 ±  11.043  ns/op
StringEncode.encodeLatin1Mixed              UTF-8  avgt   10   3278.998 ±  96.779  ns/op
StringEncode.encodeLatin1Short              UTF-8  avgt   10     99.332 ±   1.977  ns/op
StringEncode.encodeShortMixed               UTF-8  avgt   10    378.183 ±  17.504  ns/op
StringEncode.encodeUTF16LongEnd             UTF-8  avgt   10   1531.960 ±  19.300  ns/op
StringEncode.encodeUTF16LongOnly            UTF-8  avgt   10    563.810 ±   4.811  ns/op
StringEncode.encodeUTF16LongStart           UTF-8  avgt   10   2270.970 ±  28.495  ns/op
StringEncode.encodeUTF16Mixed               UTF-8  avgt   10   4403.824 ±  60.338  ns/op
StringEncode.encodeUTF16Short               UTF-8  avgt   10    158.600 ±   2.044  ns/op
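
To compare the two runs, one rough option is to divide the baseline score by the patch score for each Latin-1 benchmark. The snippet below is a minimal sketch of that arithmetic using scores copied from the tables above. It ignores the reported error margins, and the class and method names are made up for illustration.

```java
final class SpeedupSketch {

    // Ratio of baseline to patch average time; above 1.0 means the patch is faster,
    // below 1.0 means it is slower.
    static double speedup(double baselineNsPerOp, double patchNsPerOp) {
        return baselineNsPerOp / patchNsPerOp;
    }

    public static void main(String[] args) {
        // Scores in ns/op, taken from the Baseline and Patch tables.
        System.out.printf("encodeLatin1LongEnd   %.2fx%n", speedup(1974.391, 574.455));
        System.out.printf("encodeLatin1Mixed     %.2fx%n", speedup(4973.070, 3278.998));
        System.out.printf("encodeLatin1LongStart %.2fx%n", speedup(1876.366, 2230.789));
        System.out.printf("encodeLatin1LongOnly  %.2fx%n", speedup(270.097, 284.553));
    }
}
```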

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26597#issuecomment-3144446972
