On Mon, 13 Dec 2021 09:39:55 GMT, Сергей Цыпанов <d...@openjdk.java.net> wrote:

> Originally this was spotted by by Amir Hadadi in 
> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor
> 
> It looks like in the following code in `String(byte[], int, int, Charset)`
> 
> while (offset < sl) {
>     int b1 = bytes[offset];
>     if (b1 >= 0) {
>         dst[dp++] = (byte)b1;
>         offset++;                                  // <---
>         continue;
>     }
>     if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) &&
>             offset + 1 < sl) {
>         int b2 = bytes[offset + 1];
>         if (!isNotContinuation(b2)) {
>             dst[dp++] = (byte)decode2(b1, b2);
>             offset += 2;
>             continue;
>         }
>     }
>     // anything not a latin1, including the repl
>     // we have to go with the utf16
>     break;
> }
> 
> bounds check elimination is not executed when accessing byte array via 
> `bytes[offset].
> 
> The reason, I guess, is that offset variable is modified within the loop 
> (marked with arrow).
> 
> Possible fix for this could be changing:
> 
> `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)`
> 
> However the best is to invest in C2 optimization to handle all such cases.
> 
> The following benchmark demonstrates good improvement:
> 
> @State(Scope.Thread)
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> public class StringConstructorBenchmark {
>   private byte[] array;
>   private String str;
> 
>   @Setup
>   public void setup() {
>     str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. 
> Я";//Latin1 ending with Russian
>     array = str.getBytes(StandardCharsets.UTF_8);
>   }
> 
>   @Benchmark
>   public String newString()  {
>       return new String(array, 0, array.length, StandardCharsets.UTF_8);
>   }
> 
>   @Benchmark
>   public String translateEscapes()  {
>       return str.translateEscapes();
>   }
> }
> 
> Results:
> 
> //baseline
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op
> 
> //patched
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op
> 
> The same is observed in String.translateEscapes() for the same String as in 
> the benchmark above:
> 
> //baseline
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op
> 
> //patched
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op
> 
> Also I've looked into this with `LinuxPerfAsmProfiler`, full output for 
> baseline is available here 
> https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for 
> patched code here 
> https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7.
> 
> Here's the part of baseline assembly responsible for `while` loop: 
> 
>   3.62%           ││            │  0x00007fed70eb4c1c:   mov    %ebx,%ecx
>   2.29%           ││            │  0x00007fed70eb4c1e:   mov    %edx,%r9d
>   2.22%           ││            │  0x00007fed70eb4c21:   mov    (%rsp),%r8    
>                ;*iload_2 {reexecute=0 rethrow=0 return_oop=0}
>                   ││            │                                             
>                ; - java.lang.String::&lt;init&gt;@107 (line 537)
>   2.32%           ↘│            │  0x00007fed70eb4c25:   cmp    %r13d,%ecx
>                    │            │  0x00007fed70eb4c28:   jge    
> 0x00007fed70eb5388           ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>                    │            │                                             
>                ; - java.lang.String::&lt;init&gt;@110 (line 537)
>   3.05%            │            │  0x00007fed70eb4c2e:   cmp    0x8(%rsp),%ecx
>                    │            │  0x00007fed70eb4c32:   jae    
> 0x00007fed70eb5319
>   2.38%            │            │  0x00007fed70eb4c38:   mov    %r8,(%rsp)
>   2.64%            │            │  0x00007fed70eb4c3c:   movslq %ecx,%r8
>   2.46%            │            │  0x00007fed70eb4c3f:   mov    %rax,%rbx
>   3.44%            │            │  0x00007fed70eb4c42:   sub    %r8,%rbx
>   2.62%            │            │  0x00007fed70eb4c45:   add    $0x1,%rbx
>   2.64%            │            │  0x00007fed70eb4c49:   and    
> $0xfffffffffffffffe,%rbx
>   2.30%            │            │  0x00007fed70eb4c4d:   mov    %ebx,%r8d
>   3.08%            │            │  0x00007fed70eb4c50:   add    %ecx,%r8d
>   2.55%            │            │  0x00007fed70eb4c53:   movslq %r8d,%r8
>   2.45%            │            │  0x00007fed70eb4c56:   add    
> $0xfffffffffffffffe,%r8
>   2.13%            │            │  0x00007fed70eb4c5a:   cmp    (%rsp),%r8
>                    │            │  0x00007fed70eb4c5e:   jae    
> 0x00007fed70eb5319
>   3.36%            │            │  0x00007fed70eb4c64:   mov    %ecx,%edi     
>                ;*aload_1 {reexecute=0 rethrow=0 return_oop=0}
>                    │            │                                             
>                ; - java.lang.String::&lt;init&gt;@113 (line 538)
>   2.86%            │           ↗│  0x00007fed70eb4c66:   movsbl 
> 0x10(%r14,%rdi,1),%r8d       ;*baload {reexecute=0 rethrow=0 return_oop=0}
>                    │           ││                                             
>                ; - java.lang.String::&lt;init&gt;@115 (line 538)
>   2.48%            │           ││  0x00007fed70eb4c6c:   mov    %r9d,%edx
>   2.26%            │           ││  0x00007fed70eb4c6f:   inc    %edx          
>                ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>                    │           ││                                             
>                ; - java.lang.String::&lt;init&gt;@127 (line 540)
>   3.28%            │           ││  0x00007fed70eb4c71:   mov    %edi,%ebx
>   2.44%            │           ││  0x00007fed70eb4c73:   inc    %ebx          
>                ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>                    │           ││                                             
>                ; - java.lang.String::&lt;init&gt;@134 (line 541)
>   2.35%            │           ││  0x00007fed70eb4c75:   test   %r8d,%r8d
>                    ╰           ││  0x00007fed70eb4c78:   jge    
> 0x00007fed70eb4c04           ;*iflt {reexecute=0 rethrow=0 return_oop=0}
>                                ││                                             
>                ; - java.lang.String::&lt;init&gt;@120 (line 539)
> 
> and this one is for patched code:
> 
>  17.28%           ││  0x00007f6b88eb6061:   mov    %edx,%r10d                 
>   ;*iload_2 {reexecute=0 rethrow=0 return_oop=0}
>                   ││                                                          
>   ; - java.lang.String::&lt;init&gt;@107 (line 537)
>   0.11%           ↘│  0x00007f6b88eb6064:   test   %r10d,%r10d
>                    │  0x00007f6b88eb6067:   jl     0x00007f6b88eb669c         
>   ;*iflt {reexecute=0 rethrow=0 return_oop=0}
>                    │                                                          
>   ; - java.lang.String::&lt;init&gt;@108 (line 537)
>   0.39%            │  0x00007f6b88eb606d:   cmp    %r13d,%r10d
>                    │  0x00007f6b88eb6070:   jge    0x00007f6b88eb66d0         
>   ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>                    │                                                          
>   ; - java.lang.String::&lt;init&gt;@114 (line 537)
>   0.66%            │  0x00007f6b88eb6076:   mov    %ebx,%r9d
>  13.70%            │  0x00007f6b88eb6079:   cmp    0x8(%rsp),%r10d
>   0.01%            │  0x00007f6b88eb607e:   jae    0x00007f6b88eb6671
>   0.14%            │  0x00007f6b88eb6084:   movsbl 0x10(%r14,%r10,1),%edi     
>   ;*baload {reexecute=0 rethrow=0 return_oop=0}
>                    │                                                          
>   ; - java.lang.String::&lt;init&gt;@119 (line 538)
>   0.37%            │  0x00007f6b88eb608a:   mov    %r9d,%ebx
>   0.99%            │  0x00007f6b88eb608d:   inc    %ebx                       
>   ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>                    │                                                          
>   ; - java.lang.String::&lt;init&gt;@131 (line 540)
>  12.88%            │  0x00007f6b88eb608f:   movslq %r9d,%rsi                  
>   ;*bastore {reexecute=0 rethrow=0 return_oop=0}
>                    │                                                          
>   ; - java.lang.String::&lt;init&gt;@196 (line 548)
>   0.17%            │  0x00007f6b88eb6092:   mov    %r10d,%edx
>   0.39%            │  0x00007f6b88eb6095:   inc    %edx                       
>   ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>                    │                                                          
>   ; - java.lang.String::&lt;init&gt;@138 (line 541)
>   0.96%            │  0x00007f6b88eb6097:   test   %edi,%edi
>   0.02%            │  0x00007f6b88eb6099:   jl     0x00007f6b88eb60dc         
>   ;*iflt {reexecute=0 rethrow=0 return_oop=0}
> 
> Between instructions mapped to `if_icmpge` and `aload_1` in baseline we have 
> bounds check which is missing from patched code.

As soon as the proper fix is provided in 
https://github.com/openjdk/jdk/pull/7034 I close this PR.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6812

Reply via email to