Hi Christopher,

One surprising thing: returning the input without creating a new instance
when it is already in the target case makes our version faster, even though
the JVM does the same thing in both its Latin-1 and UTF-16 code paths. Is
that because our version does not even check whether the chars are
surrogates, which would mean the boost comes from the restricted character
set more than from the algorithm?

Anyway, since it does not add much code and a perf boost is always good,
+1 from me too, for the reasons you mentioned.
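
For what it's worth, here is a minimal sketch of the "no new instance" idea,
assuming ASCII-only input. The class and method names (AsciiCase,
toLowerCaseAscii) are made up for illustration and are not existing Tomcat
or JDK API:

```java
final class AsciiCase {

    private AsciiCase() {
    }

    // Hypothetical helper: lower-cases only the ASCII letters A-Z and
    // leaves every other char untouched (no locale or surrogate handling).
    static String toLowerCaseAscii(String s) {
        int len = s.length();
        for (int i = 0; i < len; i++) {
            char c = s.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                // First upper-case letter found: copy once, then convert
                // this char and everything after it in place.
                char[] result = s.toCharArray();
                result[i] = (char) (c + 32);
                for (int j = i + 1; j < len; j++) {
                    char d = result[j];
                    if (d >= 'A' && d <= 'Z') {
                        result[j] = (char) (d + 32);
                    }
                }
                return new String(result);
            }
        }
        // Nothing to change: hand back the same instance, no allocation.
        return s;
    }
}
```

With this, an already lower-case value such as "content-type" comes back as
the very same String instance, while "X-Frame-Options" is copied exactly once.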

Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>


On Wed, Sep 13, 2023 at 23:08, Mark Thomas <ma...@apache.org> wrote:

> So we are talking about a saving of around 35 nanoseconds per call.
>
> I just ran Tomcat 10.1.x through a profiler and requesting the Tomcat
> homepage triggered String.toLowerCase() just under 100 times (some of
> those calls may be from other JRE methods). So at best, we are looking
> at 3 microseconds per request. That is pretty small but it all adds up
> over time so +1 from me providing we have some sort of unit tests that
> confirms the custom code is faster than the JRE code in case
> circumstances change in the future.
>
> Mark
>
>
> On 13/09/2023 17:38, Christopher Schultz wrote:
> > All,
> >
> > Ping. I've added a few other implementations which will e.g. perform no
> > char-copy if the string is already in lower-case, so they are faster
> > under special circumstances.
> >
> > I'm happy to share my jmh runs, which seem to show that Java's
> > String.toLowerCase is getting faster and faster every time I run the
> > benchmark, which is puzzling.
> >
> > Thanks,
> > -chris
> >
> > On 9/8/23 13:39, Christopher Schultz wrote:
> >> All,
> >>
> >> Please ignore the fact that my benchmark is all oriented around
> >> toUpperCase instead of toLowerCase :)
> >>
> >> -chris
> >>
> >> On 9/8/23 13:25, Christopher Schultz wrote:
> >>> All,
> >>>
> >>> There are many cases in Tomcat where we change the letter-case of a
> >>> String value so it's easier to compare when case doesn't matter. In
> >>> particular, HTTP header names and many spec-defined values are
> >>> supposed to be case-insensitive and so all comparisons involving them
> >>> must be done without regard to letter-case.
> >>>
> >>> The idiom in Tomcat source code for that is[1]:
> >>>
> >>>      collection.add(element.toLowerCase(Locale.ENGLISH));
> >>>
> >>> Locale.ENGLISH is used because all of these values are supposed to be
> >>> in ASCII encoding and Locale.ENGLISH is as good as any equivalent
> >>> Locale that (nominally) uses (mostly) ASCII semantics.
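
To illustrate why the idiom pins a locale at all: the no-argument
toLowerCase() uses the JVM default locale, and under a Turkish locale an
upper-case 'I' lower-cases to the dotless 'ı' (U+0131), which would break
comparisons of ASCII header names. A small sketch (the class and method
names are hypothetical, chosen only for this example):

```java
import java.util.Locale;

final class LocaleCaseDemo {

    private LocaleCaseDemo() {
    }

    // Lower-case with the locale used by the Tomcat idiom quoted above.
    static String lowerEnglish(String s) {
        return s.toLowerCase(Locale.ENGLISH);
    }

    // Lower-case with Turkish rules, standing in for a JVM whose default
    // locale happens to be Turkish.
    static String lowerTurkish(String s) {
        return s.toLowerCase(Locale.forLanguageTag("tr"));
    }

    public static void main(String[] args) {
        String header = "IF-MATCH";
        System.out.println(lowerEnglish(header));
        System.out.println(lowerTurkish(header));
        // The two results differ in their first character.
        System.out.println(lowerEnglish(header).equals(lowerTurkish(header)));
    }
}
```

An ASCII-only conversion sidesteps the question entirely, since it never
consults a locale.
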
> >>>
> >>> It turns out that String.toLowerCase (and its mirror,
> >>> String.toUpperCase) has a ton of code in it to manage the many
> >>> complexities of Locales in which we are not interested.
> >>>
> >>> Implementing an ASCII-only version of toLowerCase appears to have a
> >>> speed improvement of roughly 2x for some simple cases. I have a
> >>> sample microbenchmark below and the output of jmh on Java 17.
> >>>
> >>> Given the frequency of calls to toLowerCase (many times per request),
> >>> I think it may be a worthwhile performance improvement to implement
> >>> and use our own version of toLowerCase and use it when only ASCII is
> >>> expected.
> >>>
> >>> It may even be possible to write a more complicated version of
> >>> toLowerCase than I have below that performs even faster (e.g. for
> >>> String values that end up not having any upper-case characters at all).
> >>>
> >>> WDYT?
> >>>
> >>> -chris
> >>>
> >>> [1] https://github.com/apache/tomcat/blob/feb77a15849389001ebcfdd623df86a42a62019e/java/org/apache/tomcat/util/http/parser/TokenList.java#L95
> >>>
> >>> Benchmark                                Mode  Cnt         Score         Error  Units
> >>> MyBenchmark.testStringToUpperCase       thrpt    5  28130795.259 ± 1297495.570  ops/s
> >>> MyBenchmark.testStringToUpperCaseASCII  thrpt    5  52221288.421 ± 5112349.492  ops/s
> >>>
> >>> Source:
> >>>
> >>> import java.util.concurrent.TimeUnit;
> >>>
> >>> import org.openjdk.jmh.runner.Runner;
> >>> import org.openjdk.jmh.runner.options.Options;
> >>> import org.openjdk.jmh.runner.options.OptionsBuilder;
> >>> import org.openjdk.jmh.annotations.Benchmark;
> >>> import org.openjdk.jmh.annotations.BenchmarkMode;
> >>> import org.openjdk.jmh.annotations.Fork;
> >>> import org.openjdk.jmh.annotations.Measurement;
> >>> import org.openjdk.jmh.annotations.Mode;
> >>> import org.openjdk.jmh.annotations.Warmup;
> >>>
> >>> @Warmup(iterations=5, time=5, timeUnit=TimeUnit.SECONDS)
> >>> @Measurement(iterations=5, time=5, timeUnit=TimeUnit.SECONDS)
> >>> @BenchmarkMode(Mode.Throughput)
> >>> @Fork(1)
> >>> public class MyBenchmark {
> >>>
> >>>      private static final String SOURCE = "X-Frame-Options";
> >>>
> >>>      @Benchmark
> >>>      public String testStringToUpperCase() {
> >>>          return SOURCE.toUpperCase();
> >>>      }
> >>>
> >>>      @Benchmark
> >>>      public String testStringToUpperCaseASCII() {
> >>>          return toUpperCaseASCII(SOURCE);
> >>>      }
> >>>
> >>>      public String toUpperCaseASCII(String s) {
> >>>          int len = s.length();
> >>>          char[] result = new char[len];
> >>>          for(int i=0; i<len; i++) {
> >>>              char c = s.charAt(i);
> >>>
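> >>>              // ASCII lower-case letters sit 32 code points above
> >>>              // their upper-case forms, so subtracting 32 upper-cases.
> >>>              if(c >= 'a' && c <= 'z') {
> >>>                  c -= 32;
> >>>              }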
> >>>
> >>>              result[i] = c;
> >>>          }
> >>>
> >>>          return new String(result);
> >>>      }
> >>> }
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
> > For additional commands, e-mail: dev-h...@tomcat.apache.org
> >
>
>
>
