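I think you've fallen into the trap of confusing UTF-8 and Modified UTF-8. The char* encoding HotSpot uses internally is Modified UTF-8, which differs from standard UTF-8 in how it encodes U+0000 and supplementary characters. The prevailing pattern seems to be to write an entirely new UTF-8 implementation for any performance problem involving UTF-8 (e.g. in StringCoding) :-(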
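
To make the distinction concrete, here is a minimal sketch (not taken from the webrev) that compares the two encodings using only standard library calls. It assumes DataOutputStream.writeUTF as a convenient source of Modified UTF-8 bytes; the class and method names are made up for illustration. ASCII text encodes identically, but U+0000 and supplementary characters do not.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative, not from the JDK sources.
class ModifiedUtf8Demo {

    // Standard UTF-8, as produced by the UTF-8 Charset.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8, as written by DataOutputStream.writeUTF.
    // writeUTF prepends a 2-byte length, which is stripped here.
    static byte[] modifiedUtf8(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buf)) {
                out.writeUTF(s);
            }
            byte[] framed = buf.toByteArray();
            return Arrays.copyOfRange(framed, 2, framed.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Render bytes as space-separated hex for easy comparison.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] samples = {
            "abc",           // ASCII: both encodings agree
            "\u0000",        // NUL: one byte vs. the two-byte form C0 80
            "\uD83D\uDE00"   // U+1F600: four bytes vs. two 3-byte surrogates
        };
        for (String s : samples) {
            System.out.println("UTF-8:          " + hex(standardUtf8(s)));
            System.out.println("Modified UTF-8: " + hex(modifiedUtf8(s)));
        }
    }
}
```

Any path or file name containing a character outside the Basic Multilingual Plane would therefore be handed to the platform in a form that a strict UTF-8 decoder rejects, which is why the two encodings cannot be treated as interchangeable.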

On Fri, May 26, 2017 at 10:58 AM, Claes Redestad <[email protected]> wrote:
> Hi,
>
> various JNI methods in the JDK converts from java Strings to native
> encoding using JNI_GetStringPlatformChars, which has long standing
> optimizations for dealing with various default charsets.
>
> However, UTF-8 is missing, which is a shame since we have optimized
> utilities for converting from a String to UTF-8-encoded char* (since
> this is the native encoding used by HotSpot internally).
>
> Webrev: http://cr.openjdk.java.net/~redestad/8181147/jdk.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8181147
>
> Allocation rate can drop significantly in microbenchmarks, e.g., -60%
> in a trivial micro doing new File(path).isHidden() (along with a 30%
> throughput win), while the rate of native allocation with is net neutral.
>
> (I think the code for FAST_646_US could be removed, since Solaris 8
> support was dropped in JDK 8, but that's a separate RFE...)
>
> Thanks!
>
> /Claes
