Hi again,

after an embarrassing attempt at using HotSpot's modified UTF-8 utilities as a drop-in implementation for real UTF-8 a few weeks ago, I've explored various better (read: actually working) alternatives.

While I've experimented with a few different implementations[1], my favored approach is to add a fast path in the JNI code if the String is Latin1 encoded, but defer to Java code for UTF16 Strings. This keeps the amount of JNI code we have to maintain in tandem with the Java implementation from blowing out of proportion.

Overall this gives us a speedup of around 40% for ASCII/Latin1 Strings, while not regressing noticeably for UTF16 encoded Strings.

JDK webrev: http://cr.openjdk.java.net/~redestad/8181147/jdk.04/
Top webrev: http://cr.openjdk.java.net/~redestad/8181147/top.04/
Bug: https://bugs.openjdk.java.net/browse/JDK-8181147

Special thanks to Erik Joelsson and Chris Hegarty for helping me piece together the changes necessary to add a sanity test for this code. If there's a preference, I don't mind splitting that part off as a separate RFE, as I think sanity testing should be added in this area independently of the actual code changes here.

Thanks!

/Claes

[1] Some attempts used GetStringChars (or GetStringCritical), but the issue with these is that they add a number of unavoidable mallocs for Latin1 Strings - since the jbyte array is inflated to a jchar array - which actually slows things down (and might even be slower than the baseline in some cases when NMT is enabled, since JNI code "cheats" and doesn't use NMT to track mallocs):

http://cr.openjdk.java.net/~redestad/8181147/jdk.01/

Not wanting to move forward with a solution that actually regresses performance in certain cases, I explored ways to access the byte array directly to avoid extra mallocs. Thinking that using GetByteField and friends would be prohibitively expensive, I first implemented a version using special-purpose JNI methods on the HotSpot side. This narrowly beats the approach in the proposed version in terms of raw throughput.
For a small (<10%) gain though, it doesn't seem worthwhile to go through the process of adding such special-purpose JNI methods - but it was a fun experiment:

http://cr.openjdk.java.net/~redestad/8181147/jdk.03/
http://cr.openjdk.java.net/~redestad/8181147/hotspot.03/
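
For readers who want a feel for the Latin1 fast path idea, here is a minimal standalone sketch in plain C. It is not the code from the webrev: the function name encode_utf8_fast and the CODER_* constants are hypothetical, and it works on a raw byte buffer rather than going through JNI. It only illustrates why Latin1 input is cheap to handle (each byte maps to one or two UTF-8 bytes) and how anything else could be signalled back to a general path.

```c
#include <stddef.h>

/* Hypothetical coder values for this sketch only; not taken from the webrev. */
enum { CODER_LATIN1 = 0, CODER_UTF16 = 1 };

/*
 * Encode len Latin1 bytes from src as standard UTF-8 into dst.
 * dst must have room for at least 2 * len bytes.
 * Returns the number of bytes written, or -1 if the input is not Latin1,
 * which tells the caller to fall back to the general (slower) path.
 */
long encode_utf8_fast(int coder, const unsigned char *src, size_t len,
                      unsigned char *dst)
{
    size_t i;
    size_t out = 0;

    if (coder != CODER_LATIN1) {
        return -1; /* not handled here; caller defers to the general path */
    }
    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII: copied through unchanged */
            dst[out++] = c;
        } else {
            /* U+0080..U+00FF: two-byte UTF-8 sequence */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return (long)out;
}
```

In this sketch a return value of -1 plays the role of "defer to Java code for UTF16 Strings"; the real change may signal the fallback differently.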