On Tue, 11 Jan 2022 13:02:21 GMT, Claes Redestad <redes...@openjdk.org> wrote:
>> In `String.encodeUTF8_UTF16`, making the `char c` local to each loop helps >> the performance of the method by helping C2 optimize each individual loop >> better. >> >> Results on the updated micros: >> 19-b04: >> >> Benchmark (charsetName) Mode Cnt Score >> Error Units >> StringEncode.encodeUTF16 UTF-8 avgt 15 164.457 ± >> 7.296 ns/op >> StringEncode.encodeUTF16LongEnd UTF-8 avgt 15 2294.160 ± >> 40.580 ns/op >> StringEncode.encodeUTF16LongStart UTF-8 avgt 15 9128.698 ± >> 124.636 ns/op >> >> >> Patch: >> >> Benchmark (charsetName) Mode Cnt Score >> Error Units >> StringEncode.encodeUTF16 UTF-8 avgt 15 131.296 ± >> 6.693 ns/op >> StringEncode.encodeUTF16LongEnd UTF-8 avgt 15 2282.750 ± >> 46.891 ns/op >> StringEncode.encodeUTF16LongStart UTF-8 avgt 15 4786.965 ± >> 64.896 ns/op >> >> >> Going back this seem to have been an issue since this method was introduced >> with JEP 254 in JDK 9: >> >> 8u311: >> >> Benchmark (charsetName) Mode Cnt Score >> Error Units >> StringEncode.encodeUTF16 UTF-8 avgt 15 194.057 ± >> 3.913 ns/op >> StringEncode.encodeUTF16LongEnd UTF-8 avgt 15 3024.860 ± >> 88.386 ns/op >> StringEncode.encodeUTF16LongStart UTF-8 avgt 15 5282.849 ± >> 247.230 ns/op >> >> 9: >> >> Benchmark (charsetName) Mode Cnt Score >> Error Units >> StringEncode.encodeUTF16 UTF-8 avgt 15 148.481 ± >> 9.374 ns/op >> StringEncode.encodeUTF16LongEnd UTF-8 avgt 15 2832.754 ± >> 263.372 ns/op >> StringEncode.encodeUTF16LongStart UTF-8 avgt 15 10447.115 ± >> 408.338 ns/op >> >> >> (Interestingly JDK 9 seem faster on some of the micros than later >> iterations, while slower on others. The main issue is the slow non-ASCII >> loop, where the patch speeds things up ~2x. With the patch we're >> significantly faster than both JDK 8 and 9 on all measures.) >> >> There's a JIT compiler bug hiding in plain sight here where the potentially >> uninitialized local `char c` appears to mess up optimization of the second >> loop. I'll file a separate bug for this (a standalone reproducer should be >> straightforward to produce). I think this patch is reasonable to put back >> into the JDK while we investigate if/how C2 can better handle this kind of >> condition. It might also be easier to backport, depending on whether the C2 >> fix is trivial or not. > > Claes Redestad has updated the pull request incrementally with one additional > commit since the last revision: > > Remove unused Blackholes Looks fine, consider touchups in benchmark code. test/micro/org/openjdk/bench/java/lang/StringEncode.java line 113: > 111: > 112: @Benchmark > 113: public byte[] encodeUTF16LongEnd(Blackhole bh) throws Exception { Here and later: can drop `bh` argument, since you don't need it anymore. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7026