Hi Sergey,
I took a look at the benchmark and I think there's more than 11 vs 14 in
play here.
When I compiled the benchmark with JDK 8, I saw the following in the
compilation log (irrespective of the JDK version used to run it):
   1221   67    b   org.openjdk.ea.StringCompositeKeyBenchmark::compositeKey (22 bytes)
   ...
     @ 15   org.openjdk.ea.StringCompositeKeyBenchmark$Key::<init> (7 bytes)   unloaded signature classes
...
The constructor is not inlined, so even if the Key instance doesn't
escape globally, it escapes into a call and C2 can't scalar replace it.
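To make the effect concrete, here is a hand-written sketch of what
scalar replacement amounts to. It is only an illustration: the names
ScalarReplacementSketch, withAllocation and scalarized are made up, and
C2 performs this on its IR, not on source code, and only once every
method that touches the object has been inlined.

```java
import java.util.Locale;

// Hypothetical illustration; class and method names are not from the benchmark.
final class ScalarReplacementSketch {

    // Small value class, similar in shape to the benchmark's Key.
    static final class Key {
        final String code;
        final Locale locale;

        Key(String code, Locale locale) {
            this.code = code;
            this.locale = locale;
        }

        @Override
        public int hashCode() {
            return 31 * code.hashCode() + locale.hashCode();
        }
    }

    // What the source says: allocate a Key, use it once, drop it.
    static int withAllocation(String code, Locale locale) {
        return new Key(code, locale).hashCode();
    }

    // Roughly what is left after inlining plus scalar replacement:
    // the fields live on as plain locals and no Key is allocated.
    static int scalarized(String code, Locale locale) {
        String keyCode = code;
        Locale keyLocale = locale;
        return 31 * keyCode.hashCode() + keyLocale.hashCode();
    }

    public static void main(String[] args) {
        int a = withAllocation("code1", Locale.ROOT);
        int b = scalarized("code1", Locale.ROOT);
        System.out.println("same result: " + (a == b));
    }
}
```

If the constructor call stays out of line, the second form is not
available to the compiler, because the callee needs a real object
reference.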
Inlining fails because the private constructor can't be accessed
directly from the enclosing class and requires a bridge method. The
bridge method has an additional argument of a non-existent type (with
a unique name) [1]. The inlining heuristics don't inline methods that
have unresolved classes in their signatures.
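If you want to check for such a bridge without reading javap output,
reflection can list the compiler-generated constructors. This is a
rough sketch; BridgeConstructorCheck, Key and syntheticConstructors are
hypothetical names that only mimic the shape of the benchmark classes.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the benchmark.
final class BridgeConstructorCheck {

    static final class Key {
        private final String code;

        private Key(String code) {
            this.code = code;
        }
    }

    // Calls the private constructor from the enclosing class, which is
    // what makes javac emit an accessor when targeting pre-11 levels.
    static Key newKey(String code) {
        return new Key(code);
    }

    // Returns a description of every synthetic (compiler-generated)
    // constructor declared by the given class.
    static List<String> syntheticConstructors(Class<?> type) {
        List<String> result = new ArrayList<>();
        for (Constructor<?> c : type.getDeclaredConstructors()) {
            if (c.isSynthetic()) {
                result.add(c.toString());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Expected to be empty when compiled for 11 or later; for older
        // targets it should list one constructor with an extra parameter.
        System.out.println(syntheticConstructors(Key.class));
    }
}
```

The result depends on the target level the class was compiled for, not
on the JVM that runs it.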
But if you recompile the benchmark with JDK 11 (or later), inlining
happens and the allocation is eliminated [2].
If you look at the bytecode, there's no bridge method anymore. Javac
generates a NestMembers attribute instead, which is enough to make the
private constructor accessible from the enclosing class:
$ javap -verbose -private target/classes//org/openjdk/ea/StringCompositeKeyBenchmark.class
...
NestMembers:
org/openjdk/ea/StringCompositeKeyBenchmark$Key
org/openjdk/ea/StringCompositeKeyBenchmark$Data
...
So, it boils down to the target language level being used. Starting
with 11, javac doesn't emit bridge methods anymore, which helps EA in
C2 eliminate the allocation.
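The nest relationship can also be observed at run time. A minimal
sketch, assuming JDK 11 or later for both compiling and running
(Class.getNestHost() and Class.isNestmateOf() were added in 11);
NestCheck and its members are made-up names:

```java
// Hypothetical example; requires JDK 11+ because of the nest APIs.
final class NestCheck {

    private static final class Key {
        private Key() {
        }
    }

    // Uses the private constructor from the enclosing class.
    static Object newKey() {
        return new Key();
    }

    static Class<?> keyClass() {
        return Key.class;
    }

    // True when the enclosing class and Key belong to the same nest,
    // i.e. when private members are directly accessible between them.
    static boolean sharesNestWithKey() {
        return NestCheck.class.isNestmateOf(Key.class);
    }

    public static void main(String[] args) {
        System.out.println("nest host of Key: " + Key.class.getNestHost().getName());
        System.out.println("nestmates: " + sharesNestWithKey());
    }
}
```

For classes compiled to an older target there are no nest attributes,
so I'd expect Key to report itself as its own nest host.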
Best regards,
Vladimir Ivanov
[1]
$ javap -verbose -private target/classes//org/openjdk/ea/StringCompositeKeyBenchmark.class
  public java.lang.Object compositeKey(org.openjdk.ea.StringCompositeKeyBenchmark$Data);
    descriptor: (Lorg/openjdk/ea/StringCompositeKeyBenchmark$Data;)Ljava/lang/Object;
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=6, locals=2, args_size=2
         0: aload_1
         1: invokestatic  #9   // Method org/openjdk/ea/StringCompositeKeyBenchmark$Data.access$200:(Lorg/openjdk/ea/StringCompositeKeyBenchmark$Data;)Ljava/util/HashMap;
         4: new           #13  // class org/openjdk/ea/StringCompositeKeyBenchmark$Key
         7: dup
         8: ldc           #15  // String code1
        10: aload_1
        11: invokestatic  #17  // Method org/openjdk/ea/StringCompositeKeyBenchmark$Data.access$000:(Lorg/openjdk/ea/StringCompositeKeyBenchmark$Data;)Ljava/util/Locale;
        14: aconst_null
        15: invokespecial #21  // Method org/openjdk/ea/StringCompositeKeyBenchmark$Key."<init>":(Ljava/lang/String;Ljava/util/Locale;Lorg/openjdk/ea/StringCompositeKeyBenchmark$1;)V
        18: invokevirtual #24  // Method java/util/HashMap.get:(Ljava/lang/Object;)Ljava/lang/Object;
        21: areturn

$ javap -verbose -private target/classes//org/openjdk/ea/StringCompositeKeyBenchmark\$Key.class
...
  org.openjdk.ea.StringCompositeKeyBenchmark$Key(java.lang.String, java.util.Locale, org.openjdk.ea.StringCompositeKeyBenchmark$1);
    descriptor: (Ljava/lang/String;Ljava/util/Locale;Lorg/openjdk/ea/StringCompositeKeyBenchmark$1;)V
...
[2]
   1426   67    b   org.openjdk.ea.StringCompositeKeyBenchmark::compositeKey (26 bytes)

======== Connection graph for org.openjdk.ea.StringCompositeKeyBenchmark::compositeKey
JavaObject NoEscape(NoEscape) [ 148F 142F 144F 137F [ 58 63 ]] 46 Allocate === 29 6 7 8 1 ( 44 43 39 1 1 37 42 ) [[ 47 48 49 56 57 58 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top ) StringCompositeKeyBenchmark::compositeKey @ bci:4 !jvms: StringCompositeKeyBenchmark::compositeKey @ bci:4
LocalVar [ 46P [ 63 148b 142b ]] 58 Proj === 46 [[ 59 63 142 148 ]] #5 !jvms: StringCompositeKeyBenchmark::compositeKey @ bci:4
LocalVar [ 58 46P [ 144b 137b ]] 63 CheckCastPP === 60 58 [[ 906 865 814 814 865 717 689 689 144 678 717 437 137 137 144 155 166 185 678 678 649 450 223 241 649 634 618 598 576 522 522 990 990 478 478 450 371 371 424 424 437 ]] #org/openjdk/ea/StringCompositeKeyBenchmark$Key:NotNull:exact * Oop:org/openjdk/ea/StringCompositeKeyBenchmark$Key:NotNull:exact * !jvms: StringCompositeKeyBenchmark::compositeKey @ bci:4

Scalar 63 CheckCastPP === 60 58 [[ 906 865 814 814 865 717 689 689 424 990 717 437 522 522 424 437 166 185 990 478 478 450 223 241 450 634 618 598 576 ]] #org/openjdk/ea/StringCompositeKeyBenchmark$Key:NotNull:exact *,iid=46 Oop:org/openjdk/ea/StringCompositeKeyBenchmark$Key:NotNull:exact *,iid=46 !jvms: StringCompositeKeyBenchmark::compositeKey @ bci:4
++++ Eliminated: 46 Allocate

     @ 9   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
     @ 19   org.openjdk.ea.StringCompositeKeyBenchmark$Key::<init> (15 bytes)   inline (hot)
       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)

On 26.06.2020 15:06, Сергей Цыпанов wrote:
Hello,
While looking into an issue I found that scalar replacement does not
work in a trivial case on JDK 14.0.1.
This benchmark illustrates the issue:
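package org.openjdk.ea;

import java.util.HashMap;
import java.util.Locale;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
public class StringCompositeKeyBenchmark {

    // Allocates a temporary Key on every call and uses it only for the lookup.
    @Benchmark
    public Object compositeKey(Data data) {
        return data.keyObjectMap.get(new Key(data.code, data.locale));
    }

    @State(Scope.Thread)
    public static class Data {
        private final String code = "code1";
        private final Locale locale = Locale.getDefault();
        private final HashMap<Key, Object> keyObjectMap = new HashMap<>();

        @Setup
        public void setUp() {
            keyObjectMap.put(new Key(code, locale), new Object());
        }
    }

    private static final class Key {
        private final String code;
        private final Locale locale;

        // Private constructor, called from the enclosing class and from Data.
        private Key(String code, Locale locale) {
            this.code = code;
            this.locale = locale;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Key key = (Key) o;
            if (!code.equals(key.code)) return false;
            return locale.equals(key.locale);
        }

        @Override
        public int hashCode() {
            return 31 * code.hashCode() + locale.hashCode();
        }
    }
}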
When I run this on JDK 11 (JDK 11.0.7, OpenJDK 64-Bit Server VM,
11.0.7+10-post-Ubuntu-2ubuntu218.04) I get this output:
Benchmark                                                     Mode  Cnt   Score   Error   Units
StringCompositeKeyBenchmark.compositeKey                      avgt   10   5.510 ± 0.121   ns/op
StringCompositeKeyBenchmark.compositeKey:·gc.alloc.rate       avgt   10  ≈ 10⁻⁴          MB/sec
StringCompositeKeyBenchmark.compositeKey:·gc.alloc.rate.norm  avgt   10  ≈ 10⁻⁶            B/op
StringCompositeKeyBenchmark.compositeKey:·gc.count            avgt   10     ≈ 0          counts
As I understand it, the Java runtime eliminates the object allocation
here, so no additional memory is used.
The same run on JDK 14 (JDK 14.0.1, Java HotSpot(TM) 64-Bit Server VM, 14.0.1+7)
shows an object allocation on each method call:
Benchmark                                                                  Mode  Cnt     Score     Error   Units
StringCompositeKeyBenchmark.compositeKey                                   avgt   10     7.958 ±   1.360   ns/op
StringCompositeKeyBenchmark.compositeKey:·gc.alloc.rate                    avgt   10  1937.551 ± 320.718  MB/sec
StringCompositeKeyBenchmark.compositeKey:·gc.alloc.rate.norm               avgt   10    24.001 ±   0.001    B/op
StringCompositeKeyBenchmark.compositeKey:·gc.churn.G1_Eden_Space           avgt   10  1879.111 ± 596.770  MB/sec
StringCompositeKeyBenchmark.compositeKey:·gc.churn.G1_Eden_Space.norm      avgt   10    23.244 ±   5.509    B/op
StringCompositeKeyBenchmark.compositeKey:·gc.churn.G1_Survivor_Space       avgt   10     0.267 ±   0.750  MB/sec
StringCompositeKeyBenchmark.compositeKey:·gc.churn.G1_Survivor_Space.norm  avgt   10     0.003 ±   0.009    B/op
StringCompositeKeyBenchmark.compositeKey:·gc.count                         avgt   10    23.000            counts
StringCompositeKeyBenchmark.compositeKey:·gc.time                          avgt   10    44.000                ms
At the same time, in a more trivial scenario like
@Benchmark
public int compositeKey(Data data) {
return new Key(data.code, data.locale).hashCode();
}
scalar replacement eliminates the object allocation again.
So I'm curious whether this is normal behaviour or a bug.
Regards,
Sergey Tsypanov