Hello,
recently I've run into an issue regarding String concatenation. This benchmark
summarizes it:
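import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class BrokenConcatenationBenchmark {

    @Benchmark
    public String slow(Data data) {
        final Class<? extends Data> clazz = data.clazz;
        // getName() is called inside the concatenation expression
        return "class " + clazz.getName();
    }

    @Benchmark
    public String fast(Data data) {
        final Class<? extends Data> clazz = data.clazz;
        // getName() is hoisted into a local before the concatenation
        final String clazzName = clazz.getName();
        return "class " + clazzName;
    }

    @State(Scope.Thread)
    public static class Data {
        final Class<? extends Data> clazz = getClass();

        @Setup
        public void setup() {
            //explicitly load name via native method Class.getName0()
            clazz.getName();
        }
    }
}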
On JDK 1.8.0_222 (OpenJDK 64-Bit Server VM, 25.222-b10) I got the following
results:
Benchmark                                                            Mode  Cnt     Score     Error  Units
BrokenConcatenationBenchmark.fast                                    avgt   25    22,253 ±   0,962  ns/op
BrokenConcatenationBenchmark.fast:·gc.alloc.rate                     avgt   25  9824,603 ± 400,088  MB/sec
BrokenConcatenationBenchmark.fast:·gc.alloc.rate.norm                avgt   25   240,000 ±   0,001  B/op
BrokenConcatenationBenchmark.fast:·gc.churn.PS_Eden_Space            avgt   25  9824,162 ± 397,745  MB/sec
BrokenConcatenationBenchmark.fast:·gc.churn.PS_Eden_Space.norm       avgt   25   239,994 ±   0,522  B/op
BrokenConcatenationBenchmark.fast:·gc.churn.PS_Survivor_Space        avgt   25     0,040 ±   0,011  MB/sec
BrokenConcatenationBenchmark.fast:·gc.churn.PS_Survivor_Space.norm   avgt   25     0,001 ±   0,001  B/op
BrokenConcatenationBenchmark.fast:·gc.count                          avgt   25  3798,000            counts
BrokenConcatenationBenchmark.fast:·gc.time                           avgt   25  2241,000            ms
BrokenConcatenationBenchmark.slow                                    avgt   25    54,316 ±   1,340  ns/op
BrokenConcatenationBenchmark.slow:·gc.alloc.rate                     avgt   25  8435,703 ± 198,587  MB/sec
BrokenConcatenationBenchmark.slow:·gc.alloc.rate.norm                avgt   25   504,000 ±   0,001  B/op
BrokenConcatenationBenchmark.slow:·gc.churn.PS_Eden_Space            avgt   25  8434,983 ± 198,966  MB/sec
BrokenConcatenationBenchmark.slow:·gc.churn.PS_Eden_Space.norm       avgt   25   503,958 ±   1,000  B/op
BrokenConcatenationBenchmark.slow:·gc.churn.PS_Survivor_Space        avgt   25     0,127 ±   0,011  MB/sec
BrokenConcatenationBenchmark.slow:·gc.churn.PS_Survivor_Space.norm   avgt   25     0,008 ±   0,001  B/op
BrokenConcatenationBenchmark.slow:·gc.count                          avgt   25  3789,000            counts
BrokenConcatenationBenchmark.slow:·gc.time                           avgt   25  2245,000            ms
This looks similar to JDK-8043677 [1], where an expression with a side effect
breaks the optimization of the new StringBuilder().append().append().toString()
chain. But the code of Class.getName() itself does not seem to have any side
effects:
------------------------------------------
private transient String name;

public String getName() {
    String name = this.name;
    if (name == null) {
        this.name = name = this.getName0();
    }
    return name;
}

private native String getName0();
------------------------------------------
The only suspicious thing here is the call to a native method, which in fact
happens only once; its result is then cached in a field of the Class instance.
In my benchmark I explicitly populate that cache in the setup method.
I expected the branch predictor to figure out that at each benchmark invocation
the actual value of this.name is never null, and the whole expression to be
optimized accordingly.
However, for fast() I have this:
@ 19 tsypanov.strings.benchmark.concatenation.BrokenConcatenationBenchmark::fast (30 bytes) force inline by CompileCommand
@ 6 java.lang.Class::getName (18 bytes) inline (hot)
@ 14 java.lang.Class::initClassName (0 bytes) native method
@ 14 java.lang.StringBuilder::<init> (7 bytes) inline (hot)
@ 19 java.lang.StringBuilder::append (8 bytes) inline (hot)
@ 23 java.lang.StringBuilder::append (8 bytes) inline (hot)
@ 26 java.lang.StringBuilder::toString (35 bytes) inline (hot)
i.e. the compiler is able to inline everything. For slow() it is different:
@ 19 tsypanov.strings.benchmark.concatenation.BrokenConcatenationBenchmark::slow (28 bytes) force inline by CompilerOracle
@ 9 java.lang.StringBuilder::<init> (7 bytes) inline (hot)
@ 3 java.lang.AbstractStringBuilder::<init> (12 bytes) inline (hot)
@ 1 java.lang.Object::<init> (1 bytes) inline (hot)
@ 14 java.lang.StringBuilder::append (8 bytes) inline (hot)
@ 2 java.lang.AbstractStringBuilder::append (50 bytes) inline (hot)
@ 10 java.lang.String::length (6 bytes) inline (hot)
@ 21 java.lang.AbstractStringBuilder::ensureCapacityInternal (27 bytes) inline (hot)
@ 17 java.lang.AbstractStringBuilder::newCapacity (39 bytes) inline (hot)
@ 20 java.util.Arrays::copyOf (19 bytes) inline (hot)
@ 11 java.lang.Math::min (11 bytes) (intrinsic)
@ 14 java.lang.System::arraycopy (0 bytes) (intrinsic)
@ 35 java.lang.String::getChars (62 bytes) inline (hot)
@ 58 java.lang.System::arraycopy (0 bytes) (intrinsic)
@ 18 java.lang.Class::getName (21 bytes) inline (hot)
@ 11 java.lang.Class::getName0 (0 bytes) native method
@ 21 java.lang.StringBuilder::append (8 bytes) inline (hot)
@ 2 java.lang.AbstractStringBuilder::append (50 bytes) inline (hot)
@ 10 java.lang.String::length (6 bytes) inline (hot)
@ 21 java.lang.AbstractStringBuilder::ensureCapacityInternal (27 bytes) inline (hot)
@ 17 java.lang.AbstractStringBuilder::newCapacity (39 bytes) inline (hot)
@ 20 java.util.Arrays::copyOf (19 bytes) inline (hot)
@ 11 java.lang.Math::min (11 bytes) (intrinsic)
@ 14 java.lang.System::arraycopy (0 bytes) (intrinsic)
@ 35 java.lang.String::getChars (62 bytes) inline (hot)
@ 58 java.lang.System::arraycopy (0 bytes) (intrinsic)
@ 24 java.lang.StringBuilder::toString (17 bytes) inline (hot)
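For reference, here is a minimal sketch of the two shapes as I understand javac
emits them when targeting Java 8, where "+" on strings is compiled into a
StringBuilder chain. The class and method names below are hypothetical and
only illustrate the difference: in the slow() shape the getName() call sits
between the two append() calls (matching @ 18 between @ 14 and @ 21 in the log
above), while in the fast() shape it completes before the StringBuilder is
even created (@ 6 before @ 14).

```java
public class DesugaredConcatenation {

    // Roughly what `return "class " + clazz.getName();` becomes on Java 8:
    // the getName() call is evaluated in the middle of the append chain.
    static String slowShape(Class<?> clazz) {
        return new StringBuilder()
                .append("class ")
                .append(clazz.getName())
                .toString();
    }

    // Roughly what the hoisted variant becomes: the call finishes first,
    // so the chain itself only appends a constant and a local variable.
    static String fastShape(Class<?> clazz) {
        String clazzName = clazz.getName();
        return new StringBuilder()
                .append("class ")
                .append(clazzName)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(slowShape(String.class));
        System.out.println(fastShape(String.class));
    }
}
```

Both methods return the same string; only the position of the call relative to
the chain differs.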
So the question is whether this is the expected behaviour of the JVM or a
compiler bug.
I'm asking because some projects are still on Java 8, and if this is not going
to be fixed in any of the update releases,
then it's reasonable to hoist calls to Class.getName() manually out of hot spots.
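To illustrate what I mean by manual hoisting in application code, here is a
minimal sketch. The HoistedClassName and Described classes are hypothetical
names made up for this example; the idea is simply to resolve the name once,
outside the hot path, so that the concatenation only sees a plain String field.

```java
public class HoistedClassName {

    // Hypothetical holder: Class.getName() is called once in the constructor,
    // so describe() concatenates a constant with an already resolved String.
    static final class Described {
        private final String className;

        Described(Class<?> clazz) {
            this.className = clazz.getName();
        }

        String describe() {
            return "class " + className;
        }
    }

    public static void main(String[] args) {
        Described described = new Described(Integer.class);
        System.out.println(described.describe());
    }
}
```

Whether this is worth doing depends on how hot the call site is; I have only
measured the benchmark above.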
[1] https://bugs.openjdk.java.net/browse/JDK-8043677