Use `JavaLangAccess::uncheckedNewStringNoRepl` in `MemorySegment::getString` to
avoid byte[] allocation in the String constructor.
Fall back to the old code in the case of malformed input to get replacement
characters as per Javadoc API specification. The existing tests in
`TestStringEncoding` seem sufficient to me.
I ran the tier1 test suite and it passes.
For performance analysis I ran.
make test TEST="micro:org.openjdk.bench.java.lang.foreign.ToJavaStringTest"
MICRO="OPTIONS=-prof gc"
on an AMD Ryzen 7 PRO 5750GE.
These are the formatted results, the current master is the line on top, this
feature branch is the line below. We can see an improvement in throughput
driven by a reduction in allocation.
Benchmark (size) Mode Cnt
Score Error Units
ToJavaStringTest.panama_readString 5 avgt 30
18.996 ± 0.044 ns/op
ToJavaStringTest.panama_readString 5 avgt 30
13.851 ± 0.028 ns/op
ToJavaStringTest.panama_readString 20 avgt 30
23.570 ± 0.050 ns/op
ToJavaStringTest.panama_readString 20 avgt 30
18.401 ± 0.069 ns/op
ToJavaStringTest.panama_readString 100 avgt 30
32.094 ± 0.207 ns/op
ToJavaStringTest.panama_readString 100 avgt 30
24.427 ± 0.112 ns/op
ToJavaStringTest.panama_readString 200 avgt 30
43.029 ± 0.185 ns/op
ToJavaStringTest.panama_readString 200 avgt 30
31.914 ± 0.064 ns/op
ToJavaStringTest.panama_readString 451 avgt 30
81.145 ± 0.403 ns/op
ToJavaStringTest.panama_readString 451 avgt 30
58.975 ± 0.233 ns/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm 5 avgt 30
72.000 ± 0.001 B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm 5 avgt 30
48.000 ± 0.001 B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm 20 avgt 30
104.000 ± 0.001 B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm 20 avgt 30
64.000 ± 0.001 B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm 100 avgt 30
264.000 ± 0.001 B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm 100 avgt 30
144.000 ± 0.001 B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm 200 avgt 30
456.001 ± 0.001 B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm 200 avgt 30
240.000 ± 0.001 B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm 451 avgt 30
968.001 ± 0.001 B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm 451 avgt 30
496.001 ± 0.001 B/op
I looked into whether there are inlining issues with the current or proposed
code. For this I ran
var segment = MemorySegment.ofArray(new byte[] {'c', 'o', 'f', 'f', 'e', 'e', '
', 'b', 'a', 'b', 'e', 0});
for (int i = 0; i < 200_000_000; i++) {
var string = segment.getString(0);
if (System.identityHashCode(string) == 1) {
System.out.println("x");
}
}
with `-XX:+PrintCompilation -XX:+PrintInlining -XX:+UnlockDiagnosticVMOptions `
for the current master the simplified output is like this
@ 75 jdk.internal.foreign.AbstractMemorySegmentImpl::getString (9 bytes)
force inline by annotation callee changed to help.GetString::main (107
bytes) -> TypeProfile (95646/95646 counts) =
jdk/internal/foreign/HeapMemorySegmentImpl$OfByte
@ 5 jdk.internal.foreign.AbstractMemorySegmentImpl::getString (12 bytes)
force inline by annotation
@ 1 java.util.Objects::requireNonNull (14 bytes) force inline by
annotation
@ 8 jdk.internal.foreign.StringSupport::read (67 bytes) force inline by
annotation
@ 1 jdk.internal.foreign.StringSupport$CharsetKind::of (102 bytes)
inline (hot)
@ 4 java.lang.Enum::ordinal (5 bytes) accessor
@ 45 jdk.internal.foreign.StringSupport::readByte (41 bytes) force
inline by annotation
@ 3 jdk.internal.foreign.AbstractMemorySegmentImpl::byteSize (5
bytes) accessor
@ 6 jdk.internal.foreign.StringSupport::strlenByte (232 bytes)
force inline by annotation
<snip>
@ 27 java.lang.foreign.MemorySegment::copy (29 bytes) force inline
by annotation
<snip>
@ 37 java.lang.String::<init> (16 bytes) inline (hot)
@ 2 java.util.Objects::requireNonNull (14 bytes) force inline by
annotation
@ 12 java.lang.String::<init> (86 bytes) inline (hot)
@ 23 java.lang.String::utf8 (271 bytes) inline (hot)
@ 9 java.lang.StringCoding::countPositives (33 bytes)
(intrinsic)
@ 27 java.util.Arrays::copyOfRange (82 bytes) inline (hot)
@ 11 java.lang.Object::clone (0 bytes) failed to inline:
native method (intrinsic)
@ 31 java.lang.String::<init> (15 bytes) inline (hot)
@ 1 java.lang.Object::<init> (1 bytes) inline (hot)
@ 82 java.lang.String::<init> (37 bytes) inline (hot)
@ 1 java.lang.Object::<init> (1 bytes) inline (hot)
As we can see everything is inlined but since `String::utf8` explicitly asks
for a copy of the `byte[]`
https://github.com/openjdk/jdk/blob/3263361a28c7e8c02734cb94bc9576e9f3ba5b50/src/java.base/share/classes/java/lang/String.java#L575
this can not be optimized away.
This is the compliation log with the proposed changes
@ 75 jdk.internal.foreign.AbstractMemorySegmentImpl::getString (9 bytes)
force inline by annotation callee changed to help.GetString::main (107
bytes) -> TypeProfile (121249/121249 counts) =
jdk/internal/foreign/HeapMemorySegmentImpl$OfByte
@ 5 jdk.internal.foreign.AbstractMemorySegmentImpl::getString (12 bytes)
force inline by annotation
@ 1 java.util.Objects::requireNonNull (14 bytes) force inline by
annotation
@ 8 jdk.internal.foreign.StringSupport::read (67 bytes) force inline by
annotation
@ 1 jdk.internal.foreign.StringSupport$CharsetKind::of (102 bytes)
inline (hot)
@ 4 java.lang.Enum::ordinal (5 bytes) accessor
@ 45 jdk.internal.foreign.StringSupport::readByte (55 bytes) force
inline by annotation
@ 3 jdk.internal.foreign.AbstractMemorySegmentImpl::byteSize (5
bytes) accessor
@ 6 jdk.internal.foreign.StringSupport::strlenByte (232 bytes)
force inline by annotation
<snip>
@ 27 java.lang.foreign.MemorySegment::copy (29 bytes) force inline
by annotation
<snip>
@ 36 java.lang.System$1::uncheckedNewStringNoRepl (6 bytes) inline
(hot)
@ 2 java.lang.String::newStringNoRepl (33 bytes) inline (hot)
@ 2 java.lang.String::newStringNoRepl1 (284 bytes) inline (hot)
@ 22 java.lang.String::newStringUTF8NoRepl (322 bytes) inline
(hot)
@ 4 java.lang.String::checkBoundsOffCount (10 bytes) inline
(hot)
@ 6 jdk.internal.util.Preconditions::checkFromIndexSize (25
bytes) inline (hot)
@ 24 java.lang.StringCoding::countPositives (33 bytes)
(intrinsic)
@ 73 java.lang.String::<init> (15 bytes) inline (hot)
@ 1 java.lang.Object::<init> (1 bytes) inline (hot)
everything still inlines even though some methods are larger.
-------------
Commit messages:
- 8362893: Improve performance for MemorySegment::getString
Changes: https://git.openjdk.org/jdk/pull/26493/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26493&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8362893
Stats: 19 lines in 1 file changed: 16 ins; 0 del; 3 mod
Patch: https://git.openjdk.org/jdk/pull/26493.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/26493/head:pull/26493
PR: https://git.openjdk.org/jdk/pull/26493