Re: [PATCH] Allocate uninitialized byte[] to improve performance of String-related operations

Paul Sandoz Wed, 09 Sep 2020 16:23:24 -0700

Hi Sergey,

Unsafe.allocateUninitializedArray should be used with great care, so as we are 
confident that no array, covering uninitialized memory, leaks out. Or later we 
unintentionally introduce a problem through refactoring.


The update to AbstractStringBuilder is problematic because the uninitialized 
array could potentially be accessed through reflection.

In other cases it seems you have slotted in the allocation at a low-level e.g. 
StringUTF16.newBytesFor. That makes it hard for me to evaluate the 
implications, and it's not clear to me it offers advantages in all cases, 
especially given the risks. The toCharArray/Utf allocation costs appear to be 
lost in the noise, at least from your benchmark results.

The method repeat seems a good candidate to focus on given the allocation and 
looping is obviously localized.
However, in the repeat of 1 case I would like to know why C2 does not elide the 
zeroing of the array, since it should know that the fill covers the whole 
array. 
For the repeat > 2 we can excuse C2 for not seeing through to the ranges passed 
to System.arraycopy.

Separately, renaming StringConcatHelper.new{Char}Array to 
StringConcatHelper.newUninitialized{Char}Array would be clearer, and perhaps 
moving to somewhere else.

Paul.

> On Sep 4, 2020, at 5:52 AM, Сергей Цыпанов <[email protected]> wrote:
> 
> Hello,
> 
> currently jdk.internal.misc.Unsafe declares method 
> allocateUninitializedArray(Class, int) returning uninitialized array of given 
> type and length.
> 
> Allocation of uninitialized arrays is faster especially for larger ones, so 
> we could use them for cases
> when we are sure that created array will be completely overwritten or is 
> guarded by count field (e.g. in AbstractStringBuilder).
> 
> I've exposed jdk.internal.misc.Unsafe.allocateUninitializedArray(Class, int)
> via delegating method of sun.misc.Unsafe to measure creation of byte[] with 
> benchmark [1]
> and got those results:
> 
>                    (length)  Mode  Cnt      Score      Error   Units
> constructor                0  avgt   50      7.639 ±    0.629   ns/op
> constructor               10  avgt   50      9.448 ±    0.725   ns/op
> constructor              100  avgt   50     21.158 ±    1.865   ns/op
> constructor             1000  avgt   50    146.158 ±    9.836   ns/op
> constructor            10000  avgt   50    916.321 ±   50.618   ns/op
> 
> unsafe                     0  avgt   50      8.057 ±    0.599   ns/op
> unsafe                    10  avgt   50      8.308 ±    0.907   ns/op
> unsafe                   100  avgt   50     12.232 ±    1.813   ns/op
> unsafe                  1000  avgt   50     37.679 ±    1.382   ns/op
> unsafe                 10000  avgt   50     78.896 ±    6.758   ns/op
> 
> As a result I propose to add the following methods into StringConcatHelper
> 
> @ForceInline
> static byte[] newArray(int length) {
>    return (byte[]) UNSAFE.allocateUninitializedArray(byte.class, length);
> }
> 
> @ForceInline
> static char[] newCharArray(int length) {
>    return (char[]) UNSAFE.allocateUninitializedArray(char.class, length);
> }
> 
> along with existing StringConcatHelper.newArray(long indexCoder) and utilize 
> them in String-related operations
> instead of conventional array creation with new-keyword.
> 
> I've used benchmark [2] to measure impact on affected String-methods and 
> found quite a good improvement:
> 
>                                                 before             after
> 
> Benchmark                   (length)      Score   Error     Score   Error    
> Units
> 
> newStringBuilderWithLength         8      8.288 ± 0.411    5.656 ±  0.019    
> ns/op
> newStringBuilderWithLength        64     12.954 ± 0.687    7.588 ±  0.009    
> ns/op
> newStringBuilderWithLength       128     20.603 ± 0.412   10.446 ±  0.005    
> ns/op
> newStringBuilderWithLength      1024    119.935 ± 2.383   35.452 ±  0.029    
> ns/op
> 
> newStringBuilderWithString         8     19.721 ± 0.375   14.642 ±  0.052    
> ns/op
> newStringBuilderWithString        64     34.006 ± 1.523   15.479 ±  0.031    
> ns/op
> newStringBuilderWithString       128     36.697 ± 0.972   17.052 ±  0.133    
> ns/op
> newStringBuilderWithString      1024    140.486 ± 6.396   85.156 ±  0.175    
> ns/op
> 
> repeatOneByteString                8     11.340 ± 0.197    9.736 ±  0.051    
> ns/op
> repeatOneByteString               64     20.859 ± 0.257   15.073 ±  0.024    
> ns/op
> repeatOneByteString              128     36.311 ± 1.162   22.808 ±  0.198    
> ns/op
> repeatOneByteString             1024    149.243 ± 3.083   82.839 ±  0.193    
> ns/op
> 
> repeatOneChar                      8     28.033 ± 0.615   20.377 ±  0.137    
> ns/op
> repeatOneChar                     64     56.399 ± 1.094   36.230 ±  0.051    
> ns/op
> repeatOneChar                    128     68.423 ± 5.647   44.157 ±  0.239    
> ns/op
> repeatOneChar                   1024    230.115 ± 0.312  179.126 ±  0.437    
> ns/op
> 
> replace                            8     14.684 ± 0.088   14.434 ±  0.057    
> ns/op
> replace                           64     56.811 ± 0.612   56.420 ±  0.050    
> ns/op
> replace                          128    112.694 ± 0.404  109.799 ±  1.202    
> ns/op
> replace                         1024    837.939 ± 0.855  818.802 ±  0.408    
> ns/op
> 
> replaceUtf                         8     17.802 ± 0.074   15.744 ±  0.094    
> ns/op
> replaceUtf                        64     45.754 ± 0.139   39.228 ±  0.864    
> ns/op
> replaceUtf                       128     67.170 ± 0.353   50.497 ±  1.218    
> ns/op
> replaceUtf                      1024    415.767 ± 6.829  297.831 ± 22.510    
> ns/op
> 
> toCharArray                        8      6.164 ± 0.033    6.128 ±  0.064    
> ns/op
> toCharArray                       64     10.960 ± 0.032   13.566 ±  0.802    
> ns/op
> toCharArray                      128     20.373 ± 0.013   20.823 ±  0.376    
> ns/op
> toCharArray                     1024    165.923 ± 0.098  164.362 ±  0.065    
> ns/op
> 
> toCharArrayUTF                     8      8.009 ± 0.067    7.778 ±  0.026    
> ns/op
> toCharArrayUTF                    64     11.112 ± 0.014   10.880 ±  0.010    
> ns/op
> toCharArrayUTF                   128     20.390 ± 0.014   20.103 ±  0.017    
> ns/op
> toCharArrayUTF                  1024    166.233 ± 0.076  163.827 ±  0.099    
> ns/op
> 
> So the question is could we include the changes of attached patch into JDK?
> 
> With best regards,
> Sergey Tsypanov
> 
> 
> 1. Benchmark for array-allocation
> 
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
> public class ArrayAllocationBenchmark {
>  private static Unsafe U;
> 
>  static {
>    try {
>      Field f = Unsafe.class.getDeclaredField("theUnsafe");
>      f.setAccessible(true);
>      U = (Unsafe) f.get(null);
>    } catch (Exception e) {
>      new RuntimeException(e);
>    }
>  }
> 
>  @Benchmark
>  public byte[] constructor(Data data) {
>    return new byte[data.length];
>  }
> 
>  @Benchmark
>  public byte[] unsafe(Data data) {
>    return (byte[]) U.allocateUninitializedArray(byte.class, data.length);
>  }
> 
>  @State(Scope.Thread)
>  public static class Data {
>    @Param({"0", "10", "100", "1000", "10000"})
>    private int length;
>  }
> }
> 
> 
> 2. Benchmark for String-method measurements
> 
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
> public class MiscStringBenchmark {
> 
>  @Benchmark
>  public char[] toCharArrayLatin1(Data data) {
>    return data.string.toCharArray();
>  }
> 
>  @Benchmark
>  public char[] toCharArrayUTF(Data data) {
>    return data.utfString.toCharArray();
>  }
> 
>  @Benchmark
>  public String repeatOneByteString(Data data) {
>    return data.oneCharString.repeat(data.length);
>  }
> 
>  @Benchmark
>  public String repeatOneChar(Data data) {
>    return data.oneUtfCharString.repeat(data.length);
>  }
> 
>  @Benchmark
>  public String replace(Data data){
>    return data.stringToReplace.replace('z', 'b');
>  }
> 
>  @Benchmark
>  public String replaceUtf(Data data){
>    return data.utfStringToReplace.replace('я', 'ю');
>  }
> 
>  @Benchmark
>  public StringBuilder newStringBuilderWithLength(Data data) {
>    return new StringBuilder(data.length);
>  }
> 
>  @Benchmark
>  public StringBuilder newStringBuilderWithString(Data data) {
>    return new StringBuilder(data.string);
>  }
> 
>  @State(Scope.Thread)
>  public static class Data {
> 
>    @Param({"8", "64", "128", "1024"})
>    private int length;
> 
>    private String string;
>    public String utfString;
>    private final String oneCharString = "a";
>    private final String oneUtfCharString = "ё";
>    private String stringToReplace;
>    private String utfStringToReplace;
> 
>    @Setup
>    public void setup() {
>      string = oneCharString.repeat(length);
>      utfString = oneUtfCharString.repeat(length);
> 
>      stringToReplace = string + 'z';
>      utfStringToReplace = utfString + 'я';
>    }
>  }
> }
> <uninit_array.patch>

Re: [PATCH] Allocate uninitialized byte[] to improve performance of String-related operations

Reply via email to