richardstartin edited a comment on pull request #7599:
URL: https://github.com/apache/pinot/pull/7599#issuecomment-947183264
This benchmark demonstrates the improvement on JDK11 and also that the
implementation was sensible on JDK8. It uses four different characters to
generate all four possible UTF-8 sequence lengths (1 to 4 bytes):
```java
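import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class DecodingBenchmark {

  // Number of repetitions of `unit` in the test string.
  @Param({"1", "10", "100", "1000"})
  int length;

  // One character per UTF-8 sequence length (1, 2, 3 and 4 bytes).
  @Param({"a", "ß", "道", "\uD841\uDF0E"})
  String unit;

  private String data;

  @Setup(Level.Trial)
  public void setup() {
    StringBuilder sb = new StringBuilder(length * unit.length());
    for (int i = 0; i < length; i++) {
      sb.append(unit);
    }
    this.data = sb.toString();
  }

  // Passes the Charset constant directly.
  @Benchmark
  public byte[] decodeUTF8Charset() {
    return data.getBytes(StandardCharsets.UTF_8);
  }

  // Passes the charset by name, which requires a lookup.
  @Benchmark
  public byte[] decodeUTF8CharsetName() {
    try {
      return data.getBytes("UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new RuntimeException(e);
    }
  }
}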
```
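As a quick sanity check on the `unit` values, the sketch below prints how many UTF-8 bytes each one encodes to. The class name `Utf8SequenceLengths` and its helper are hypothetical and not part of the PR. The non-ASCII units are written as Unicode escapes (assuming ß is U+00DF and 道 is U+9053) so the source compiles regardless of file encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class Utf8SequenceLengths {

  // Number of bytes the given string occupies once encoded as UTF-8.
  static int utf8Length(String unit) {
    return unit.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    // Same four units as the benchmark's @Param values.
    String[] units = {"a", "\u00DF", "\u9053", "\uD841\uDF0E"};
    for (String unit : units) {
      System.out.println(unit
          + " chars=" + unit.length()
          + " utf8Bytes=" + utf8Length(unit));
    }
  }
}
```

The last unit is a surrogate pair, so it has a `String.length()` of 2 while the other three have a length of 1.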
JDK11: pay attention to both the avgt (lower with `StandardCharsets`, even for
strings as long as 100 characters) and the normalised allocation rate
(identical for both variants):
```
Benchmark                                                    (length)  (unit)  Mode  Cnt      Score     Error  Units
DecodingBenchmark.decodeUTF8Charset                                 1       a  avgt    5     12.690 ±   0.290  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm             1       a  avgt    5     24.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                 1       ß  avgt    5     19.147 ±   0.128  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm             1       ß  avgt    5     24.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                 1       道  avgt    5     15.301 ±   0.047  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm             1       道  avgt    5     24.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                 1       𠜎  avgt    5     30.449 ±   0.280  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm             1       𠜎  avgt    5     48.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                10       a  avgt    5     13.916 ±   0.212  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm            10       a  avgt    5     32.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                10       ß  avgt    5     27.862 ±   0.246  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm            10       ß  avgt    5     40.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                10       道  avgt    5     54.348 ±   1.026  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm            10       道  avgt    5     48.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                10       𠜎  avgt    5     87.956 ±   0.600  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm            10       𠜎  avgt    5    136.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                               100       a  avgt    5     17.833 ±   0.228  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       a  avgt    5    120.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                               100       ß  avgt    5    162.337 ±   0.883  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       ß  avgt    5    216.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                               100       道  avgt    5    441.978 ±   2.089  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       道  avgt    5    320.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                               100       𠜎  avgt    5    654.704 ±   2.691  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       𠜎  avgt    5   1032.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                              1000       a  avgt    5    143.401 ±   2.833  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm          1000       a  avgt    5   1016.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                              1000       ß  avgt    5   1953.952 ±  18.583  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm          1000       ß  avgt    5   2016.001 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                              1000       道  avgt    5   4340.205 ± 168.275  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm          1000       道  avgt    5   3016.002 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                              1000       𠜎  avgt    5   6668.428 ±  71.027  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm          1000       𠜎  avgt    5  10032.003 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                             1       a  avgt    5     35.770 ±   0.594  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm         1       a  avgt    5     24.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                             1       ß  avgt    5     34.974 ±   0.464  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm         1       ß  avgt    5     24.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                             1       道  avgt    5     34.873 ±   0.927  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm         1       道  avgt    5     24.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                             1       𠜎  avgt    5     49.799 ±   2.304  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm         1       𠜎  avgt    5     48.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                            10       a  avgt    5     38.018 ±   0.273  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm        10       a  avgt    5     32.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                            10       ß  avgt    5     51.948 ±   0.146  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm        10       ß  avgt    5     40.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                            10       道  avgt    5     72.474 ±   0.840  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm        10       道  avgt    5     48.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                            10       𠜎  avgt    5    107.176 ±   0.275  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm        10       𠜎  avgt    5    136.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       a  avgt    5     39.540 ±   0.426  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       a  avgt    5    120.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       ß  avgt    5    184.251 ±   1.008  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       ß  avgt    5    216.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       道  avgt    5    457.615 ±   0.749  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       道  avgt    5    320.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       𠜎  avgt    5    672.385 ±  17.598  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       𠜎  avgt    5   1032.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                          1000       a  avgt    5    142.672 ±   2.752  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm      1000       a  avgt    5   1016.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                          1000       ß  avgt    5   1600.567 ±   7.064  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm      1000       ß  avgt    5   2016.001 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                          1000       道  avgt    5   4388.180 ±  48.504  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm      1000       道  avgt    5   3016.002 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                          1000       𠜎  avgt    5   6663.150 ±  60.651  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm      1000       𠜎  avgt    5  10032.003 ±   0.001   B/op
```
Just compare any pair with the same parameters. E.g. for a 100-character ASCII
string, there is no extra allocation but the time is more than halved:
```
DecodingBenchmark.decodeUTF8Charset                               100       a  avgt    5     17.833 ±   0.228  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       a  avgt    5    120.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       a  avgt    5     39.540 ±   0.426  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       a  avgt    5    120.000 ±   0.001   B/op
```
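The 120 B/op figure is consistent with the call allocating nothing beyond the returned `byte[]`. The sketch below works through that arithmetic under two assumptions that are not stated in the benchmark output: a 16-byte array header and 8-byte object alignment, as is typical for 64-bit HotSpot with compressed class pointers. The class and method names are hypothetical.

```java
// Hypothetical helper, for illustration only.
class ArrayAllocationEstimate {

  // Assumed layout: 16-byte array header, objects aligned to 8 bytes.
  static final int ARRAY_HEADER_BYTES = 16;
  static final int OBJECT_ALIGNMENT_BYTES = 8;

  // Estimated heap footprint of a byte[] holding `length` elements.
  static long byteArraySize(int length) {
    long raw = ARRAY_HEADER_BYTES + (long) length;
    // Round up to the next multiple of the alignment.
    return (raw + OBJECT_ALIGNMENT_BYTES - 1)
        / OBJECT_ALIGNMENT_BYTES * OBJECT_ALIGNMENT_BYTES;
  }

  public static void main(String[] args) {
    // ASCII strings encode to one byte per character.
    for (int length : new int[] {1, 10, 100, 1000}) {
      System.out.println(length + " ASCII chars -> "
          + byteArraySize(length) + " B");
    }
  }
}
```

Under these assumptions the estimates for 1, 10, 100 and 1000 ASCII characters are 24, 32, 120 and 1016 bytes, which match the JDK11 `gc.alloc.rate.norm` rows for unit `a`.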
On JDK8, the avgt is much higher for both variants and roughly equal between
them, and more is allocated when using `StandardCharsets`:
```
Benchmark                                                    (length)  (unit)  Mode  Cnt      Score     Error  Units
DecodingBenchmark.decodeUTF8Charset                                 1       a  avgt    5     58.534 ±   0.758  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm             1       a  avgt    5    152.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                 1       ß  avgt    5     65.887 ±   1.033  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm             1       ß  avgt    5    152.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                 1       道  avgt    5     57.018 ±   1.035  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm             1       道  avgt    5    128.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                 1       𠜎  avgt    5     77.302 ±   2.559  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm             1       𠜎  avgt    5    176.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                10       a  avgt    5     65.488 ±   0.488  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm            10       a  avgt    5    184.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                10       ß  avgt    5     85.795 ±   1.593  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm            10       ß  avgt    5    192.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                10       道  avgt    5     84.273 ±   1.305  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm            10       道  avgt    5    152.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                                10       𠜎  avgt    5    137.654 ±   1.435  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm            10       𠜎  avgt    5    264.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                               100       a  avgt    5    136.765 ±   2.063  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       a  avgt    5    544.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                               100       ß  avgt    5    289.917 ±   4.506  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       ß  avgt    5    640.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                               100       道  avgt    5    371.064 ±   1.888  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       道  avgt    5    424.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                               100       𠜎  avgt    5    812.540 ±  13.020  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       𠜎  avgt    5   1160.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                              1000       a  avgt    5   1067.417 ±  24.818  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm          1000       a  avgt    5   4136.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                              1000       ß  avgt    5   2618.408 ±  38.177  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm          1000       ß  avgt    5   5136.001 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                              1000       道  avgt    5   3433.173 ±  32.028  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm          1000       道  avgt    5   3120.002 ±   0.001   B/op
DecodingBenchmark.decodeUTF8Charset                              1000       𠜎  avgt    5   7883.986 ±  79.200  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm          1000       𠜎  avgt    5  10160.003 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                             1       a  avgt    5     55.946 ±   0.497  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm         1       a  avgt    5     48.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                             1       ß  avgt    5     61.014 ±   2.516  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm         1       ß  avgt    5     48.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                             1       道  avgt    5     56.231 ±   0.289  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm         1       道  avgt    5     24.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                             1       𠜎  avgt    5     74.309 ±   1.246  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm         1       𠜎  avgt    5     48.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                            10       a  avgt    5     61.794 ±   0.312  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm        10       a  avgt    5     80.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                            10       ß  avgt    5     84.202 ±   0.701  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm        10       ß  avgt    5     88.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                            10       道  avgt    5     84.276 ±   0.999  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm        10       道  avgt    5     48.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                            10       𠜎  avgt    5    133.305 ±   1.293  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm        10       𠜎  avgt    5    136.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       a  avgt    5    133.185 ±   2.087  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       a  avgt    5    440.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       ß  avgt    5    248.623 ±   3.828  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       ß  avgt    5    536.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       道  avgt    5    372.915 ±  14.813  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       道  avgt    5    320.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       𠜎  avgt    5    820.216 ±  82.970  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       𠜎  avgt    5   1032.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                          1000       a  avgt    5   1066.354 ±  83.986  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm      1000       a  avgt    5   4032.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                          1000       ß  avgt    5   2700.836 ± 190.698  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm      1000       ß  avgt    5   5032.001 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                          1000       道  avgt    5   3466.882 ± 288.632  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm      1000       道  avgt    5   3016.002 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                          1000       𠜎  avgt    5   7986.518 ± 359.135  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm      1000       𠜎  avgt    5  10032.003 ±   0.001   B/op
```
For the same 100-character ASCII string mentioned above, the user would be
better off upgrading their JVM than relying on this optimisation:
JDK11
```
DecodingBenchmark.decodeUTF8Charset                               100       a  avgt    5     17.833 ±   0.228  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       a  avgt    5    120.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       a  avgt    5     39.540 ±   0.426  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       a  avgt    5    120.000 ±   0.001   B/op
```
JDK8
```
DecodingBenchmark.decodeUTF8Charset                               100       a  avgt    5    136.765 ±   2.063  ns/op
DecodingBenchmark.decodeUTF8Charset:·gc.alloc.rate.norm           100       a  avgt    5    544.000 ±   0.001   B/op
DecodingBenchmark.decodeUTF8CharsetName                           100       a  avgt    5    133.185 ±   2.087  ns/op
DecodingBenchmark.decodeUTF8CharsetName:·gc.alloc.rate.norm       100       a  avgt    5    440.000 ±   0.001   B/op
```
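To put numbers on that comparison, the sketch below computes the speed-up ratios from the four avgt scores quoted above for the 100-character ASCII case. The class and method names are hypothetical; the scores are copied from the tables and nothing is re-measured.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only.
class SpeedupComparison {

  // How many times faster the second timing is than the first.
  static double ratio(double slowerNs, double fasterNs) {
    return slowerNs / fasterNs;
  }

  public static void main(String[] args) {
    // avgt scores in ns/op for length=100, unit=a.
    double jdk11Charset = 17.833;
    double jdk11CharsetName = 39.540;
    double jdk8Charset = 136.765;
    double jdk8CharsetName = 133.185;

    System.out.println(String.format(Locale.ROOT,
        "JDK11, name -> Charset constant: %.2fx",
        ratio(jdk11CharsetName, jdk11Charset)));
    System.out.println(String.format(Locale.ROOT,
        "JDK8, name -> Charset constant: %.2fx",
        ratio(jdk8CharsetName, jdk8Charset)));
    System.out.println(String.format(Locale.ROOT,
        "Charset constant, JDK8 -> JDK11: %.2fx",
        ratio(jdk8Charset, jdk11Charset)));
  }
}
```

On these figures the optimisation is worth roughly 2.2x on JDK11 and nothing measurable on JDK8, while moving from JDK8 to JDK11 is worth roughly 7.7x for the `StandardCharsets` variant.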