> In `String.encodeUTF8_UTF16`, making the `char c` local to each loop helps 
> the performance of the method by helping C2 optimize each individual loop 
> better.
> 
> Results on the updated micros:
> 19-b04:
> 
> Benchmark                          (charsetName)  Mode  Cnt     Score     Error  Units
> StringEncode.encodeUTF16                   UTF-8  avgt   15   164.457 ±   7.296  ns/op
> StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15  2294.160 ±  40.580  ns/op
> StringEncode.encodeUTF16LongStart          UTF-8  avgt   15  9128.698 ± 124.636  ns/op
> 
> 
> Patch:
> 
> Benchmark                          (charsetName)  Mode  Cnt     Score    Error  Units
> StringEncode.encodeUTF16                   UTF-8  avgt   15   131.296 ±  6.693  ns/op
> StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15  2282.750 ± 46.891  ns/op
> StringEncode.encodeUTF16LongStart          UTF-8  avgt   15  4786.965 ± 64.896  ns/op
> 
> 
> Going back, this seems to have been an issue since this method was 
> introduced with JEP 254 in JDK 9:
> 
> 8u311:
> 
> Benchmark                          (charsetName)  Mode  Cnt     Score     Error  Units
> StringEncode.encodeUTF16                   UTF-8  avgt   15   194.057 ±   3.913  ns/op
> StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15  3024.860 ±  88.386  ns/op
> StringEncode.encodeUTF16LongStart          UTF-8  avgt   15  5282.849 ± 247.230  ns/op
> 
> 9:
> 
> Benchmark                          (charsetName)  Mode  Cnt      Score     Error  Units
> StringEncode.encodeUTF16                   UTF-8  avgt   15    148.481 ±   9.374  ns/op
> StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15   2832.754 ± 263.372  ns/op
> StringEncode.encodeUTF16LongStart          UTF-8  avgt   15  10447.115 ± 408.338  ns/op
> 
> 
> (Interestingly, JDK 9 seems faster on some of the micros than later 
> iterations, while slower on others. The main issue is the slow non-ASCII 
> loop, where the patch speeds things up ~2x: `encodeUTF16LongStart` drops 
> from about 9129 to 4787 ns/op. With the patch we're significantly faster 
> than both JDK 8 and 9 on all measures.)
> 
> There's a JIT compiler bug hiding in plain sight here where the potentially 
> uninitialized local `char c` appears to mess up optimization of the second 
> loop. I'll file a separate bug for this (a standalone reproducer should be 
> straightforward to produce). I think this patch is reasonable to put back 
> into the JDK while we investigate if/how C2 can better handle this kind of 
> condition. It might also be easier to backport, depending on whether the C2 
> fix is trivial or not.
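
To illustrate the change described above, here is a minimal, simplified sketch of the two loop shapes. It is not the JDK source: the class and method names (`LoopLocalCharSketch`, `encodeSharedLocal`, `encodeLoopLocal`, `decode`, `putCodePoint`) are hypothetical, and it works on a plain `char[]` rather than the `byte[]`-backed UTF-16 value that `String` uses internally. Both variants are meant to produce identical output; the only difference is where `char c` is declared.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; NOT the JDK's String.encodeUTF8_UTF16.
final class LoopLocalCharSketch {

    // Writes the UTF-8 encoding of code point cp at dst[dp..]; returns the new dp.
    static int putCodePoint(int cp, byte[] dst, int dp) {
        if (cp < 0x80) {
            dst[dp++] = (byte) cp;
        } else if (cp < 0x800) {
            dst[dp++] = (byte) (0xC0 | (cp >> 6));
            dst[dp++] = (byte) (0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            dst[dp++] = (byte) (0xE0 | (cp >> 12));
            dst[dp++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
            dst[dp++] = (byte) (0x80 | (cp & 0x3F));
        } else {
            dst[dp++] = (byte) (0xF0 | (cp >> 18));
            dst[dp++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
            dst[dp++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
            dst[dp++] = (byte) (0x80 | (cp & 0x3F));
        }
        return dp;
    }

    // Returns the code point starting at c, pairing it with src[next] if the two
    // form a surrogate pair; an unpaired surrogate is replaced with '?'.
    static int decode(char c, char[] src, int next) {
        if (!Character.isSurrogate(c)) {
            return c;
        }
        if (Character.isHighSurrogate(c) && next < src.length
                && Character.isLowSurrogate(src[next])) {
            return Character.toCodePoint(c, src[next]);
        }
        return '?';
    }

    // Shape before the patch: one `char c` declared ahead of both loops.
    static byte[] encodeSharedLocal(char[] src) {
        byte[] dst = new byte[src.length * 3];
        int sp = 0, dp = 0;
        char c; // shared local; unassigned if the first loop never assigns it
        while (sp < src.length && (c = src[sp]) < 0x80) { // ASCII fast path
            dst[dp++] = (byte) c;
            sp++;
        }
        while (sp < src.length) { // general (non-ASCII) path
            c = src[sp++];
            int cp = decode(c, src, sp);
            if (cp >= 0x10000) sp++; // consumed the low surrogate as well
            dp = putCodePoint(cp, dst, dp);
        }
        return Arrays.copyOf(dst, dp);
    }

    // Shape after the patch: each loop declares its own `char c`.
    static byte[] encodeLoopLocal(char[] src) {
        byte[] dst = new byte[src.length * 3];
        int sp = 0, dp = 0;
        while (sp < src.length) { // ASCII fast path
            char c = src[sp];
            if (c >= 0x80) break;
            dst[dp++] = (byte) c;
            sp++;
        }
        while (sp < src.length) { // general (non-ASCII) path
            char c = src[sp++];
            int cp = decode(c, src, sp);
            if (cp >= 0x10000) sp++; // consumed the low surrogate as well
            dp = putCodePoint(cp, dst, dp);
        }
        return Arrays.copyOf(dst, dp);
    }

    public static void main(String[] args) {
        String s = "abc\u00e5\u65e5\u672c\ud83d\ude00xyz";
        byte[] expected = s.getBytes(StandardCharsets.UTF_8);
        System.out.println("shared local matches: "
                + Arrays.equals(expected, encodeSharedLocal(s.toCharArray())));
        System.out.println("loop local matches:   "
                + Arrays.equals(expected, encodeLoopLocal(s.toCharArray())));
    }
}
```

The sketch only shows the source-level difference. Whether C2 compiles the two shapes differently depends on the JDK build and has to be measured, for example with the `StringEncode` micros quoted above.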

Claes Redestad has updated the pull request incrementally with one additional 
commit since the last revision:

  Happy new year!

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7026/files
  - new: https://git.openjdk.java.net/jdk/pull/7026/files/5b3dd69a..f13ff236

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7026&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7026&range=02-03

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7026.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7026/head:pull/7026

PR: https://git.openjdk.java.net/jdk/pull/7026
