Hi Sergey, Unsafe.allocateUninitializedArray should be used with great care, so as we are confident that no array, covering uninitialized memory, leaks out. Or later we unintentionally introduce a problem through refactoring.
The update to AbstractStringBuilder is problematic because the uninitialized array could potentially be accessed through reflection. In other cases it seems you have slotted in the allocation at a low-level e.g. StringUTF16.newBytesFor. That makes it hard for me to evaluate the implications, and it's not clear to me it offers advantages in all cases, especially given the risks. The toCharArray/Utf allocation costs appear to be lost in the noise, at least from your benchmark results. The method repeat seems a good candidate to focus on given the allocation and looping is obviously localized. However, in the repeat of 1 case I would like to know why C2 does not elide the zeroing of the array, since it should know that the fill covers the whole array. For the repeat > 2 we can excuse C2 for not seeing through to the ranges passed to System.arraycopy. Separately, renaming StringConcatHelper.new{Char}Array to StringConcatHelper.newUninitialized{Char}Array would be clearer, and perhaps moving to somewhere else. Paul. > On Sep 4, 2020, at 5:52 AM, Сергей Цыпанов <sergei.tsypa...@yandex.ru> wrote: > > Hello, > > currently jdk.internal.misc.Unsafe declares method > allocateUninitializedArray(Class, int) returning uninitialized array of given > type and length. > > Allocation of uninitialized arrays is faster especially for larger ones, so > we could use them for cases > when we are sure that created array will be completely overwritten or is > guarded by count field (e.g. in AbstractStringBuilder). > > I've exposed jdk.internal.misc.Unsafe.allocateUninitializedArray(Class, int) > via delegating method of sun.misc.Unsafe to measure creation of byte[] with > benchmark [1] > and got those results: > > (length) Mode Cnt Score Error Units > constructor 0 avgt 50 7.639 ± 0.629 ns/op > constructor 10 avgt 50 9.448 ± 0.725 ns/op > constructor 100 avgt 50 21.158 ± 1.865 ns/op > constructor 1000 avgt 50 146.158 ± 9.836 ns/op > constructor 10000 avgt 50 916.321 ± 50.618 ns/op > > unsafe 0 avgt 50 8.057 ± 0.599 ns/op > unsafe 10 avgt 50 8.308 ± 0.907 ns/op > unsafe 100 avgt 50 12.232 ± 1.813 ns/op > unsafe 1000 avgt 50 37.679 ± 1.382 ns/op > unsafe 10000 avgt 50 78.896 ± 6.758 ns/op > > As a result I propose to add the following methods into StringConcatHelper > > @ForceInline > static byte[] newArray(int length) { > return (byte[]) UNSAFE.allocateUninitializedArray(byte.class, length); > } > > @ForceInline > static char[] newCharArray(int length) { > return (char[]) UNSAFE.allocateUninitializedArray(char.class, length); > } > > along with existing StringConcatHelper.newArray(long indexCoder) and utilize > them in String-related operations > instead of conventional array creation with new-keyword. > > I've used benchmark [2] to measure impact on affected String-methods and > found quite a good improvement: > > before after > > Benchmark (length) Score Error Score Error > Units > > newStringBuilderWithLength 8 8.288 ± 0.411 5.656 ± 0.019 > ns/op > newStringBuilderWithLength 64 12.954 ± 0.687 7.588 ± 0.009 > ns/op > newStringBuilderWithLength 128 20.603 ± 0.412 10.446 ± 0.005 > ns/op > newStringBuilderWithLength 1024 119.935 ± 2.383 35.452 ± 0.029 > ns/op > > newStringBuilderWithString 8 19.721 ± 0.375 14.642 ± 0.052 > ns/op > newStringBuilderWithString 64 34.006 ± 1.523 15.479 ± 0.031 > ns/op > newStringBuilderWithString 128 36.697 ± 0.972 17.052 ± 0.133 > ns/op > newStringBuilderWithString 1024 140.486 ± 6.396 85.156 ± 0.175 > ns/op > > repeatOneByteString 8 11.340 ± 0.197 9.736 ± 0.051 > ns/op > repeatOneByteString 64 20.859 ± 0.257 15.073 ± 0.024 > ns/op > repeatOneByteString 128 36.311 ± 1.162 22.808 ± 0.198 > ns/op > repeatOneByteString 1024 149.243 ± 3.083 82.839 ± 0.193 > ns/op > > repeatOneChar 8 28.033 ± 0.615 20.377 ± 0.137 > ns/op > repeatOneChar 64 56.399 ± 1.094 36.230 ± 0.051 > ns/op > repeatOneChar 128 68.423 ± 5.647 44.157 ± 0.239 > ns/op > repeatOneChar 1024 230.115 ± 0.312 179.126 ± 0.437 > ns/op > > replace 8 14.684 ± 0.088 14.434 ± 0.057 > ns/op > replace 64 56.811 ± 0.612 56.420 ± 0.050 > ns/op > replace 128 112.694 ± 0.404 109.799 ± 1.202 > ns/op > replace 1024 837.939 ± 0.855 818.802 ± 0.408 > ns/op > > replaceUtf 8 17.802 ± 0.074 15.744 ± 0.094 > ns/op > replaceUtf 64 45.754 ± 0.139 39.228 ± 0.864 > ns/op > replaceUtf 128 67.170 ± 0.353 50.497 ± 1.218 > ns/op > replaceUtf 1024 415.767 ± 6.829 297.831 ± 22.510 > ns/op > > toCharArray 8 6.164 ± 0.033 6.128 ± 0.064 > ns/op > toCharArray 64 10.960 ± 0.032 13.566 ± 0.802 > ns/op > toCharArray 128 20.373 ± 0.013 20.823 ± 0.376 > ns/op > toCharArray 1024 165.923 ± 0.098 164.362 ± 0.065 > ns/op > > toCharArrayUTF 8 8.009 ± 0.067 7.778 ± 0.026 > ns/op > toCharArrayUTF 64 11.112 ± 0.014 10.880 ± 0.010 > ns/op > toCharArrayUTF 128 20.390 ± 0.014 20.103 ± 0.017 > ns/op > toCharArrayUTF 1024 166.233 ± 0.076 163.827 ± 0.099 > ns/op > > So the question is could we include the changes of attached patch into JDK? > > With best regards, > Sergey Tsypanov > > > 1. Benchmark for array-allocation > > @BenchmarkMode(Mode.AverageTime) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"}) > public class ArrayAllocationBenchmark { > private static Unsafe U; > > static { > try { > Field f = Unsafe.class.getDeclaredField("theUnsafe"); > f.setAccessible(true); > U = (Unsafe) f.get(null); > } catch (Exception e) { > new RuntimeException(e); > } > } > > @Benchmark > public byte[] constructor(Data data) { > return new byte[data.length]; > } > > @Benchmark > public byte[] unsafe(Data data) { > return (byte[]) U.allocateUninitializedArray(byte.class, data.length); > } > > @State(Scope.Thread) > public static class Data { > @Param({"0", "10", "100", "1000", "10000"}) > private int length; > } > } > > > 2. Benchmark for String-method measurements > > @BenchmarkMode(Mode.AverageTime) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"}) > public class MiscStringBenchmark { > > @Benchmark > public char[] toCharArrayLatin1(Data data) { > return data.string.toCharArray(); > } > > @Benchmark > public char[] toCharArrayUTF(Data data) { > return data.utfString.toCharArray(); > } > > @Benchmark > public String repeatOneByteString(Data data) { > return data.oneCharString.repeat(data.length); > } > > @Benchmark > public String repeatOneChar(Data data) { > return data.oneUtfCharString.repeat(data.length); > } > > @Benchmark > public String replace(Data data){ > return data.stringToReplace.replace('z', 'b'); > } > > @Benchmark > public String replaceUtf(Data data){ > return data.utfStringToReplace.replace('я', 'ю'); > } > > @Benchmark > public StringBuilder newStringBuilderWithLength(Data data) { > return new StringBuilder(data.length); > } > > @Benchmark > public StringBuilder newStringBuilderWithString(Data data) { > return new StringBuilder(data.string); > } > > @State(Scope.Thread) > public static class Data { > > @Param({"8", "64", "128", "1024"}) > private int length; > > private String string; > public String utfString; > private final String oneCharString = "a"; > private final String oneUtfCharString = "ё"; > private String stringToReplace; > private String utfStringToReplace; > > @Setup > public void setup() { > string = oneCharString.repeat(length); > utfString = oneUtfCharString.repeat(length); > > stringToReplace = string + 'z'; > utfStringToReplace = utfString + 'я'; > } > } > } > <uninit_array.patch>