Hello! I extract JDK 9 code from String, StringLatin1, StringUTF16 and optimize some functions https://github.com/volodin-aa/openjdk-benchmark
My optimization: 1. Create fullMatchesCI() methods with simple for (...) loop https://github.com/volodin-aa/openjdk-benchmark/blob/master/src/su/tweak/openjdk/FastStringUTF16.java#L101 2. Remove 'constants' for equalsIgnoreCase() https://github.com/volodin-aa/openjdk-benchmark/blob/master/src/su/tweak/openjdk/FastString.java#L213 3. Optimize upper case verifying http://hg.openjdk.java.net/jdk9/jdk9/jdk/file/56379812ec5e/src/java.base/share/classes/java/lang/StringUTF16.java#l549 Character.toUpperCase(c2) more expencive then simple if (...) and I write if (u1 == c2) instead if (u1 == u2) ... // Try convert c1 to upper case and compare with c2 char u1 = Character.toUpperCase(c1); if (u1 == c2) { continue; } // Try convert c2 to upper case and compare with u1 char u2 = Character.toUpperCase(c2); if (u1 == u2) { continue; } ... For 'random case' strings it is more efficient. And developers can optimeze code like this http://hg.openjdk.java.net/jdk9/jdk9/jdk/file/56379812ec5e/src/java.base/share/classes/java/lang/Boolean.java#l130 public static boolean parseBoolean(String s) { return "true".equalsIgnoreCase(s); } My results https://docs.google.com/spreadsheets/d/1_Fxky6ftZTHpyuRJbM9wvqOzpoORLpIeciukMWuYuXw/pubhtml In table I compute equalsIgnoreCase() time without initialisation time. For case "string" and "String" my version more faster then internalVM version. In other cases VM optimization is faster. 27.04.2016, 14:38, "Aleksey Shipilev" <aleksey.shipi...@oracle.com>: > Hi Andrey, > > On 04/27/2016 12:57 PM, Andrey wrote: >> I publish my JMH benchmark at github >> https://github.com/volodin-aa/openjdk-benchmark > > Please note that you really should compete with JDK 9 String, not with > JDK 8. String.equalsIgnoreCase is different in JDK 9, and the obvious > improvement one can do there is: > > diff -r 5a6df35b0f97 src/java.base/share/classes/java/lang/StringLatin1.java > --- a/src/java.base/share/classes/java/lang/StringLatin1.java Wed Apr 27 > 09:13:51 2016 +0200 > +++ b/src/java.base/share/classes/java/lang/StringLatin1.java Wed Apr 27 > 14:16:14 2016 +0300 > @@ -315,11 +315,14 @@ > byte[] other, int ooffset, > int len) { > int last = toffset + len; > while (toffset < last) { > - char c1 = (char)(value[toffset++] & 0xff); > - char c2 = (char)(other[ooffset++] & 0xff); > - if (c1 == c2) { > + byte b1 = value[toffset++]; > + byte b2 = other[ooffset++]; > + if (b1 == b2) { > continue; > } > + char c1 = (char)(b1 & 0xff); > + char c2 = (char)(b2 & 0xff); > + > char u1 = Character.toUpperCase(c1); > char u2 = Character.toUpperCase(c2); > if (u1 == u2) { > > ...which improves performance even on short Strings. Maybe specializing > Character.toUpperCase for Latin1 strings would pay off more. Maybe > specializing regionMatches for complete Strings would worth the > increased complexity too, but that should be tried on JDK code first. > > To make a good justification for the change, the benchmark should really > test: > a) Different lengths; > b) Different starting mismatch offset; > c) Different Latin1/UTF16 pairs. > > Note well: these optimizations would require studying the generated code > looking for the compiler quirks, and as such would require much more > time than anyone thinks it would take (even after taking Hofstadter's > Law into account). > > Thanks, > -Aleksey