Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v5]

2022-03-26 Thread Daniel Jeliński
On Wed, 23 Mar 2022 00:35:14 GMT, Xin Liu  wrote:

>> If an AbstractStringBuilder only grows, an inflated value that is already 
>> encoded in UTF16 can't be compressed. 
>> toString() can skip compression in this case, which saves an array 
>> allocation in StringUTF16::compress().
>> 
>> java.io.BufferedReader::readLine() is a case where the StringBuilder only 
>> grows. 
>> 
>> In the microbenchmark, we expect allocation/op to drop by 20%. The initial 
>> capacity of the StringBuilder is S bytes. When it encounters the first 
>> character that can't be encoded in LATIN1, it inflates and allocates a new 
>> array of 2*S bytes. `toString()` then tries to compress that value, so it 
>> needs to allocate S bytes. The last step allocates 2*S bytes because it has 
>> to copy the string, so 5*S bytes are allocated in total. By skipping the 
>> failed compression, only 4*S bytes are allocated; that's 20% less. In real 
>> execution, we observe a 16% allocation reduction, as tracked by the JMH GC 
>> profiler metric `gc.alloc.rate.norm`. I think that's because HotSpot can't 
>> track all allocations. 
>> 
>> Allocation is not the only thing that drops: runtime performance (ns/op) 
>> also improves by 3.34% to 18.91%. 
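
The 20% estimate in the description can be checked with a few lines of arithmetic. This is only a sketch of that accounting: the class and method names are made up for illustration, and it counts just the three arrays the description enumerates (the inflated 2*S array, the S-byte compression attempt, and the final 2*S copy).

```java
// Illustrative arithmetic only; not part of the patch.
final class AllocationEstimate {
    // Before the patch: inflated UTF16 array (2*S) + failed compression
    // attempt in toString() (S) + final copy of the string (2*S) = 5*S.
    static long bytesBefore(long s) {
        return 2 * s + s + 2 * s;
    }

    // After the patch: the failed compression attempt is skipped, leaving 4*S.
    static long bytesAfter(long s) {
        return 2 * s + 2 * s;
    }

    // Expected relative reduction in allocated bytes.
    static double expectedReduction(long s) {
        return (bytesBefore(s) - bytesAfter(s)) / (double) bytesBefore(s);
    }

    public static void main(String[] args) {
        long s = 128; // hypothetical capacity in bytes, chosen for the example
        System.out.printf("before=%d after=%d reduction=%.0f%%%n",
                bytesBefore(s), bytesAfter(s), 100 * expectedReduction(s));
    }
}
```

The ratio is independent of S, which is why the description quotes a single expected figure for all benchmark sizes.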
>> 
>> Before: 
>> 
>> $ make test \
>>     TEST="micro:org.openjdk.bench.java.lang.StringBuilders.toStringWithMixedChars" \
>>     MICRO="OPTIONS=-prof gc -gc true -o before.log -jvm $HOME/Devel/jdk_baseline/bin/java"
>> 
>> Benchmark                                                               (MIXED_SIZE)  Mode  Cnt     Score      Error   Units
>> StringBuilders.toStringWithMixedChars                                            128  avgt   15   649.846 ±   76.291   ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             128  avgt   15   872.855 ±  128.259  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        128  avgt   15   880.121 ±    0.050    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    128  avgt   15   707.730 ±  194.421  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               128  avgt   15   706.602 ±   94.504    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                128  avgt   15     0.001 ±    0.002  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           128  avgt   15     0.001 ±    0.001    B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                  128  avgt   15   113.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                   128  avgt   15    85.000                 ms
>> StringBuilders.toStringWithMixedChars                                            256  avgt   15  1316.652 ±  112.771   ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             256  avgt   15   800.864 ±   76.869  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        256  avgt   15  1648.288 ±    0.162    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    256  avgt   15   599.736 ±  174.001  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               256  avgt   15  1229.669 ±  318.518    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                256  avgt   15     0.001 ±    0.001  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           256  avgt   15     0.001 ±    0.002    B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                  256  avgt   15   133.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                   256  avgt   15    92.000                 ms
>> StringBuilders.toStringWithMixedChars                                           1024  avgt   15  5204.303 ±  418.115   ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                            1024  avgt   15   768.730 ±   72.945  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                       1024  avgt   15  6256.844 ±    0.358    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                   1024  avgt   15   655.852 ±  121.602  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm              1024  avgt   15  5315.265 ±  578.878    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space               1024  avgt   15     0.002 ±    0.002  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm          1024  avgt   15     0.014 ±    0.011    B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                 1024  avgt   15    96.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time  

Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v4]

2022-03-24 Thread Xin Liu
On Tue, 22 Mar 2022 10:09:40 GMT, Claes Redestad  wrote:

>> Xin Liu has updated the pull request with a new target base due to a merge 
>> or a rebase. The incremental webrev excludes the unrelated changes brought 
>> in by the merge/rebase. The pull request contains six additional commits 
>> since the last revision:
>> 
>>  - Merge branch 'master' into JDK-8282429
>>  - split StringBuilder.toString() performance test out of StringBuilders.java
>>  - Change growOnly to maybeLatin.
>>
>>    This patch also copies over the attribute from the other 
>> AbstractStringBuilder.
>>    Add a unit test to cover methods which cause maybeLatin1 to become true.
>>  - make sure String(StringBuffer) is still synchronized.
>>  - Add a microbenchmark.
>>  - 8282429:  StringBuilder/StringBuffer.toString() skip compressing for 
>> UTF16 strings
>
> Thanks for refactoring the micro to avoid the redundant run overheads, and 
> checking that we're size neutral on all configurations. 
> 
> I have added a few minor comments inline that you can choose to address or 
> ignore. 
> 
> Good work!

hi, @cl4es  @djelinski , 
Could you take another look at the latest revision?  

thanks, 
--lx

-

PR: https://git.openjdk.java.net/jdk/pull/7671


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v5]

2022-03-22 Thread Xin Liu
> If an AbstractStringBuilder only grows, an inflated value that is already 
> encoded in UTF16 can't be compressed. 
> toString() can skip compression in this case, which saves an array 
> allocation in StringUTF16::compress().
> 
> java.io.BufferedReader::readLine() is a case where the StringBuilder only 
> grows. 
> 
> In the microbenchmark, we expect allocation/op to drop by 20%. The initial 
> capacity of the StringBuilder is S bytes. When it encounters the first 
> character that can't be encoded in LATIN1, it inflates and allocates a new 
> array of 2*S bytes. `toString()` then tries to compress that value, so it 
> needs to allocate S bytes. The last step allocates 2*S bytes because it has 
> to copy the string, so 5*S bytes are allocated in total. By skipping the 
> failed compression, only 4*S bytes are allocated; that's 20% less. In real 
> execution, we observe a 16% allocation reduction, as tracked by the JMH GC 
> profiler metric `gc.alloc.rate.norm`. I think that's because HotSpot can't 
> track all allocations. 
> 
> Allocation is not the only thing that drops: runtime performance (ns/op) 
> also improves by 3.34% to 18.91%. 
> 
> Before: 
> 
> $ make test \
>     TEST="micro:org.openjdk.bench.java.lang.StringBuilders.toStringWithMixedChars" \
>     MICRO="OPTIONS=-prof gc -gc true -o before.log -jvm $HOME/Devel/jdk_baseline/bin/java"
> 
> Benchmark                                                               (MIXED_SIZE)  Mode  Cnt     Score      Error   Units
> StringBuilders.toStringWithMixedChars                                            128  avgt   15   649.846 ±   76.291   ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             128  avgt   15   872.855 ±  128.259  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        128  avgt   15   880.121 ±    0.050    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    128  avgt   15   707.730 ±  194.421  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               128  avgt   15   706.602 ±   94.504    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                128  avgt   15     0.001 ±    0.002  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           128  avgt   15     0.001 ±    0.001    B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                  128  avgt   15   113.000             counts
> StringBuilders.toStringWithMixedChars:·gc.time                                   128  avgt   15    85.000                 ms
> StringBuilders.toStringWithMixedChars                                            256  avgt   15  1316.652 ±  112.771   ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             256  avgt   15   800.864 ±   76.869  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        256  avgt   15  1648.288 ±    0.162    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    256  avgt   15   599.736 ±  174.001  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               256  avgt   15  1229.669 ±  318.518    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                256  avgt   15     0.001 ±    0.001  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           256  avgt   15     0.001 ±    0.002    B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                  256  avgt   15   133.000             counts
> StringBuilders.toStringWithMixedChars:·gc.time                                   256  avgt   15    92.000                 ms
> StringBuilders.toStringWithMixedChars                                           1024  avgt   15  5204.303 ±  418.115   ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                            1024  avgt   15   768.730 ±   72.945  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                       1024  avgt   15  6256.844 ±    0.358    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                   1024  avgt   15   655.852 ±  121.602  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm              1024  avgt   15  5315.265 ±  578.878    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space               1024  avgt   15     0.002 ±    0.002  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm          1024  avgt   15     0.014 ±    0.011    B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                 1024  avgt   15    96.000             counts
> StringBuilders.toStringWithMixedChars:·gc.time                                  1024  avgt   15    86.000                 ms
> 
> 
> After
> 
> $make test 
> 

Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v3]

2022-03-22 Thread Claes Redestad
On Tue, 22 Mar 2022 06:54:35 GMT, Xin Liu  wrote:

>> src/java.base/share/classes/java/lang/AbstractStringBuilder.java line 1008:
>> 
>>> 1006: this.count = newCount;
>>> 1007: putStringAt(start, str);
>>> 1008: if (end - start > 0) {
>> 
>> regardless of value of `end - start` you could also skip setting 
>> `maybeLatin1 = true` if:
>> - `str.coder() == UTF16`
>> - `this.coder == LATIN1`
>
> hi, @cl4es, 
> you are correct about this and about the comment at setCharAt(), but I don't 
> think it's necessary to check all cases. This attribute is just a hint. If 
> this.coder == LATIN1, then it doesn't matter whether maybeLatin1 is true. 
> 
> If we try to check all cases, the code will become too complex and 
> error-prone. deleteCharAt() and setLength() would also need checks, which 
> would pollute the code even more. I'm inclined to set this attribute 
> conservatively in all deleting methods.

Right, we don't need to check every case and I agree with favoring simplicity 
at the expense of some false positives. Perhaps the `if (end - start > 0) {` 
test here isn't pulling its weight either and we should just unconditionally 
set `maybeLatin1 = true` even if we're not actually replacing anything (which 
is very much a corner-case).
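
To make the cost of a false positive concrete: when the hint is set but the content still holds a character outside LATIN1, the only penalty is one wasted compression attempt, and the resulting string is unchanged. A rough sketch, assuming a hypothetical `tryCompress` helper that returns null on failure (a stand-in for the compression step named in the PR description, not the JDK code):

```java
// Hypothetical helper for illustration; this is not the JDK's StringUTF16.compress().
final class CompressSketch {
    // Returns a one-byte-per-char copy, or null if some char does not fit in LATIN1.
    static byte[] tryCompress(char[] src) {
        byte[] dst = new byte[src.length]; // the allocation that a failed attempt wastes
        for (int i = 0; i < src.length; i++) {
            if (src[i] > 0xFF) {
                return null; // failed: the caller falls back to copying the UTF16 data
            }
            dst[i] = (byte) src[i];
        }
        return dst;
    }

    public static void main(String[] args) {
        // First input fits in LATIN1; second contains a char above 0xFF.
        System.out.println(tryCompress("abc".toCharArray()) != null);
        System.out.println(tryCompress("ab\u4e2d".toCharArray()) != null);
    }
}
```

Setting the flag unconditionally therefore trades an occasional wasted scan for simpler bookkeeping in the mutating methods.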

-

PR: https://git.openjdk.java.net/jdk/pull/7671


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v4]

2022-03-22 Thread Claes Redestad
On Tue, 22 Mar 2022 08:05:35 GMT, Xin Liu  wrote:


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v4]

2022-03-22 Thread Xin Liu

Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v3]

2022-03-22 Thread Xin Liu
On Tue, 15 Mar 2022 23:25:17 GMT, Claes Redestad  wrote:

>> Xin Liu has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Change growOnly to maybeLatin.
>>   
>>   This patch also copies over the attribute from the other 
>> AbstractStringBuilder.
>>   Add a unit test to cover methods which cause maybeLatin1 to become true.
>
> src/java.base/share/classes/java/lang/AbstractStringBuilder.java line 1008:
> 
>> 1006: this.count = newCount;
>> 1007: putStringAt(start, str);
>> 1008: if (end - start > 0) {
> 
> regardless of value of `end - start` you could also skip setting 
> `maybeLatin1 = true` if:
> - `str.coder() == UTF16`
> - `this.coder == LATIN1`

hi, @cl4es, 
you are correct about this and about the comment at setCharAt(), but I don't 
think it's necessary to check all cases. This attribute is just a hint. If 
this.coder == LATIN1, then it doesn't matter whether maybeLatin1 is true. 

If we try to check all cases, the code will become too complex and 
error-prone. deleteCharAt() and setLength() would also need checks, which would 
pollute the code even more. I'm inclined to set this attribute conservatively 
in all deleting methods.
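
The "hint" behaviour described here can be sketched with a toy builder. Everything below is hypothetical and only models the idea from this thread: the class `HintedBuilder`, its fields, and the attempt counter are invented for illustration and do not mirror the actual AbstractStringBuilder code.

```java
// Toy model of the hint discussed in this thread; NOT the JDK implementation.
final class HintedBuilder {
    private final StringBuilder chars = new StringBuilder();
    private boolean utf16;        // stands in for coder == UTF16
    private boolean maybeLatin1;  // hint: a deletion may have removed every non-LATIN1 char
    private int compressionAttempts;

    HintedBuilder append(String s) {
        chars.append(s);
        if (!fitsLatin1(s)) {
            utf16 = true; // "inflate" on the first char that LATIN1 cannot encode
        }
        return this;
    }

    HintedBuilder deleteCharAt(int index) {
        chars.deleteCharAt(index);
        maybeLatin1 = true; // set conservatively, without inspecting what was deleted
        return this;
    }

    String build() {
        // Attempt compression only for UTF16 content that has seen a deletion.
        if (utf16 && maybeLatin1) {
            compressionAttempts++;
            if (fitsLatin1(chars)) {
                utf16 = false;
            }
        }
        return chars.toString();
    }

    int compressionAttempts() {
        return compressionAttempts;
    }

    private static boolean fitsLatin1(CharSequence cs) {
        for (int i = 0; i < cs.length(); i++) {
            if (cs.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }
}
```

In this model a grow-only builder never attempts compression, a LATIN1 builder ignores the flag entirely, and only a UTF16 builder that has seen a deletion pays for the check.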

-

PR: https://git.openjdk.java.net/jdk/pull/7671


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v3]

2022-03-21 Thread Xin Liu
On Wed, 9 Mar 2022 08:33:36 GMT, Xin Liu  wrote:


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v3]

2022-03-15 Thread Claes Redestad
On Wed, 9 Mar 2022 08:33:36 GMT, Xin Liu  wrote:


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v3]

2022-03-15 Thread Xin Liu
On Wed, 9 Mar 2022 08:33:36 GMT, Xin Liu  wrote:

>> StringBuilders.toStringWithMixedChars:·gc.time   

Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v3]

2022-03-09 Thread Daniel Jeliński
On Wed, 9 Mar 2022 08:33:36 GMT, Xin Liu  wrote:


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v3]

2022-03-09 Thread Xin Liu
> If an AbstractStringBuilder only grows, an inflated value that is already
> encoded in UTF16 can't be compressed.
> toString() can skip compression in this case, which saves an array
> allocation in StringUTF16::compress().
> 
> java.io.BufferedReader::readLine() is one case where the StringBuilder only grows.
> 
> In the microbenchmark, we expect allocation/op to drop by 20%. The initial
> capacity of the StringBuilder is S bytes. When it encounters the first character
> that can't be encoded in LATIN1, it inflates and allocates a new array of 2*S bytes.
> `toString()` then tries to compress that value, so it needs to allocate S bytes.
> The last step allocates 2*S bytes because it has to copy the string, so
> 5*S bytes are allocated in total. By skipping the failed compression, only
> 4*S bytes are allocated, which is 20% less. In real execution,
> we observe a 16% allocation reduction, as tracked by the JMH GC profiler metric
> `gc.alloc.rate.norm`. I think this is because HotSpot can't track all
> allocations.
> 
> Besides the drop in allocation, runtime performance (ns/op) also improves by
> 3.34% to 18.91%.
> 
> Before:
> 
> $ make test TEST="micro:org.openjdk.bench.java.lang.StringBuilders.toStringWithMixedChars" MICRO="OPTIONS=-prof gc -gc true -o before.log -jvm $HOME/Devel/jdk_baseline/bin/java"
> 
> Benchmark                                                               (MIXED_SIZE)  Mode  Cnt     Score     Error  Units
> StringBuilders.toStringWithMixedChars                                            128  avgt   15   649.846 ±  76.291  ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             128  avgt   15   872.855 ± 128.259  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        128  avgt   15   880.121 ±   0.050  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    128  avgt   15   707.730 ± 194.421  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               128  avgt   15   706.602 ±  94.504  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                128  avgt   15     0.001 ±   0.002  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           128  avgt   15     0.001 ±   0.001  B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                  128  avgt   15   113.000            counts
> StringBuilders.toStringWithMixedChars:·gc.time                                   128  avgt   15    85.000            ms
> StringBuilders.toStringWithMixedChars                                            256  avgt   15  1316.652 ± 112.771  ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             256  avgt   15   800.864 ±  76.869  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        256  avgt   15  1648.288 ±   0.162  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    256  avgt   15   599.736 ± 174.001  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               256  avgt   15  1229.669 ± 318.518  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                256  avgt   15     0.001 ±   0.001  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           256  avgt   15     0.001 ±   0.002  B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                  256  avgt   15   133.000            counts
> StringBuilders.toStringWithMixedChars:·gc.time                                   256  avgt   15    92.000            ms
> StringBuilders.toStringWithMixedChars                                           1024  avgt   15  5204.303 ± 418.115  ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                            1024  avgt   15   768.730 ±  72.945  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                       1024  avgt   15  6256.844 ±   0.358  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                   1024  avgt   15   655.852 ± 121.602  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm              1024  avgt   15  5315.265 ± 578.878  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space               1024  avgt   15     0.002 ±   0.002  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm          1024  avgt   15     0.014 ±   0.011  B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                 1024  avgt   15    96.000            counts
> StringBuilders.toStringWithMixedChars:·gc.time                                  1024  avgt   15    86.000            ms
> 
> 
> After
> 
> $make test 
> 
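
The 5*S versus 4*S estimate in the description above can be checked with a small model. This is only a back-of-the-envelope sketch, not JDK code: the class and method names are made up for illustration, and it counts only the three array allocations the description lists (inflation, failed compression, final copy).

```java
// Back-of-the-envelope model of the allocation estimate; NOT JDK code.
// AllocationEstimate and its method names are hypothetical.
final class AllocationEstimate {
    /** Bytes allocated when toString() tries, and fails, to compress. */
    static long withFailedCompression(long s) {
        long inflate = 2 * s;   // UTF16 array created when the builder inflates
        long compress = s;      // LATIN1 array allocated by the failed compression
        long copy = 2 * s;      // final UTF16 copy made for the new String
        return inflate + compress + copy;
    }

    /** Bytes allocated when the failed compression is skipped. */
    static long skippingCompression(long s) {
        long inflate = 2 * s;
        long copy = 2 * s;
        return inflate + copy;
    }

    /** Relative saving in percent. */
    static double reductionPercent(long s) {
        long before = withFailedCompression(s);
        long after = skippingCompression(s);
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        // The S values are illustrative; they are not tied to MIXED_SIZE.
        for (long s : new long[] {128, 256, 1024}) {
            System.out.printf("S=%d: %d -> %d bytes, %.1f%% less%n",
                    s, withFailedCompression(s), skippingCompression(s),
                    reductionPercent(s));
        }
    }
}
```

The ratio does not depend on S, which is why the expected saving is a flat 20%. The measured 16% is lower, and the model does not attempt to explain that gap.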

Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v2]

2022-03-08 Thread Xin Liu
On Tue, 8 Mar 2022 17:33:58 GMT, Daniel Jeliński  wrote:

> As for a better name for `growOnly`, something like `mayBeLatin` would better 
> convey the variable's purpose. What do you think?

These are tricky. I need to add tests to cover them.

The problem is that this patch fails to copy the attribute
`growOnly`/`maybeLatin1` over from the other AbstractStringBuilder. I think we can
fix this loophole. Other types such as String are well-formed, so they don't
suffer from this issue.
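
To make the loophole concrete, here is a toy model. It is not the JDK's AbstractStringBuilder: `ToyBuilder`, the `copyFlag` parameter and `wouldTryCompress()` are hypothetical names, and the flag semantics (set when a shrinking operation may have removed the last non-LATIN1 char) are an assumption made for illustration.

```java
// Toy model of the flag discussed above; NOT the JDK's AbstractStringBuilder.
// ToyBuilder, copyFlag and wouldTryCompress are illustrative names only.
final class ToyBuilder {
    private final StringBuilder chars = new StringBuilder();
    private boolean utf16;        // set once a char outside LATIN1 is stored
    private boolean maybeLatin1;  // set when UTF16 content may fit LATIN1 again

    ToyBuilder append(char c) {
        if (c > 0xFF) {
            utf16 = true;         // "inflate": LATIN1 can no longer hold the content
        }
        chars.append(c);
        return this;
    }

    ToyBuilder deleteCharAt(int index) {
        chars.deleteCharAt(index);
        if (utf16) {
            maybeLatin1 = true;   // the deleted char may have been the only wide one
        }
        return this;
    }

    // Appends another builder; copyFlag=false models the loophole.
    ToyBuilder append(ToyBuilder other, boolean copyFlag) {
        utf16 |= other.utf16;     // assumption: appending UTF16 data inflates the target
        if (copyFlag) {
            maybeLatin1 |= other.maybeLatin1;
        }
        chars.append(other.chars);
        return this;
    }

    boolean contentFitsLatin1() {
        return chars.chars().allMatch(c -> c <= 0xFF);
    }

    // Whether a toString() in this model would still attempt compression.
    boolean wouldTryCompress() {
        return utf16 && maybeLatin1;
    }

    public static void main(String[] args) {
        // Inflate with one wide char, then delete it again.
        ToyBuilder source = new ToyBuilder().append('a').append('\u4e2d').deleteCharAt(1);
        ToyBuilder buggy = new ToyBuilder().append(source, false);
        ToyBuilder fixed = new ToyBuilder().append(source, true);
        System.out.println(buggy.contentFitsLatin1() + " " + buggy.wouldTryCompress());
        System.out.println(fixed.contentFitsLatin1() + " " + fixed.wouldTryCompress());
    }
}
```

In this model the `buggy` builder holds content that fits LATIN1 but would skip compression, while the `fixed` builder still attempts it because the flag travelled with the appended data.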

-

PR: https://git.openjdk.java.net/jdk/pull/7671


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v2]

2022-03-08 Thread Daniel Jeliński
On Mon, 7 Mar 2022 23:36:41 GMT, Xin Liu  wrote:


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v2]

2022-03-07 Thread Xin Liu
On Mon, 7 Mar 2022 12:10:51 GMT, Daniel Jeliński  wrote:

>> Xin Liu has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   make sure String(StringBuffer) is still synchronized.
>
> src/java.base/share/classes/java/lang/String.java line 1446:
> 
>> 1444:  */
>> 1445: public String(StringBuffer buffer) {
>> 1446: this(buffer, null);
> 
> This method is no longer synchronized on StringBuffer

You're right. Fixed in revision #2.
This could have been very tricky to discover. Thanks for the heads-up!
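
The concern in the review comment can be illustrated with a toy class. This is not the JDK's String or StringBuffer: `ToyBuffer` and its methods are hypothetical, and the sketch only shows the difference between reading a buffer's content with and without holding its monitor.

```java
// Toy illustration of the locking concern; NOT the JDK's String or StringBuffer.
// ToyBuffer and its methods are hypothetical names.
final class ToyBuffer {
    private final StringBuilder data = new StringBuilder();
    private boolean lastReadHeldLock;

    synchronized ToyBuffer append(String s) {
        data.append(s);
        return this;
    }

    // Reads the content without taking the buffer's monitor itself.
    // A concurrent append() could interleave with this read.
    String readUnlocked() {
        lastReadHeldLock = Thread.holdsLock(this);
        return data.toString();
    }

    // Reads the content while holding the buffer's monitor,
    // so no synchronized append() can run in the middle of the copy.
    synchronized String readLocked() {
        return readUnlocked();
    }

    // Reports whether the most recent read ran under the monitor.
    boolean lastReadHeldLock() {
        return lastReadHeldLock;
    }

    public static void main(String[] args) {
        ToyBuffer buffer = new ToyBuffer().append("ab");
        buffer.readUnlocked();
        System.out.println("unlocked read held lock: " + buffer.lastReadHeldLock());
        buffer.readLocked();
        System.out.println("locked read held lock: " + buffer.lastReadHeldLock());
    }
}
```

A constructor that delegates to something like `readUnlocked()` loses the guarantee that `readLocked()` gives, which is the kind of regression the review comment points out.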

-

PR: https://git.openjdk.java.net/jdk/pull/7671


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v2]

2022-03-07 Thread Xin Liu

Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings

2022-03-07 Thread Daniel Jeliński
On Thu, 3 Mar 2022 02:36:58 GMT, Xin Liu  wrote:


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings

2022-03-03 Thread Xin Liu
On Thu, 3 Mar 2022 08:11:19 GMT, Daniel Jeliński  wrote:


Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings

2022-03-03 Thread Xin Liu
On Thu, 3 Mar 2022 08:07:53 GMT, Daniel Jeliński  wrote:

>> If AbstractStringBuilder only grow, the inflated value which has been 
>> encoded in UTF16 can't be compressed. 
>> toString() can skip compression in this case. This can save an 
>> ArrayAllocation in StringUTF16::compress().
>> 
>> java.io.BufferedRead::readLine() is a case that StringBuilder grows only. 
>> 
>> In microbench, we expect to see that allocation/op reduces 20%.  The initial 
>> capacity of StringBuilder is S in bytes. When it encounters the 1st 
>> character that can't be encoded in LATIN1, it inflates and allocate a new 
>> array of 2*S. `toString()` will try to compress that value so it need to 
>> allocate S bytes. The last step allocates 2*S bytes because it has to copy 
>> the string.  so it requires to allocate 5 * S bytes in total.  By skipping 
>> the failed compression, it only allocates 4 * S bytes.  that's 20%. In real 
>> execution, we observe 16% allocation reduction, tracked by JMH GC profiler 
>> `gc.alloc.rate.norm `.  I think it's because HotSpot can't track all 
>> allocations. 
>> 
>> Not only allocation drops, the runtime performance(ns/op) also increases 
>> from 3.34% to 18.91%. 
>> 
>> Before: 
>> 
>> $$make test 
>> TEST="micro:org.openjdk.bench.java.lang.StringBuilders.toStringWithMixedChars"
>>  MICRO="OPTIONS=-prof gc -gc true -o before.log -jvm 
>> $HOME/Devel/jdk_baseline/bin/java" 
>> 
>> Benchmark                                                               (MIXED_SIZE)  Mode  Cnt     Score     Error   Units
>> StringBuilders.toStringWithMixedChars                                            128  avgt   15   649.846 ±  76.291   ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             128  avgt   15   872.855 ± 128.259  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        128  avgt   15   880.121 ±   0.050    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    128  avgt   15   707.730 ± 194.421  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               128  avgt   15   706.602 ±  94.504    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                128  avgt   15     0.001 ±   0.002  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           128  avgt   15     0.001 ±   0.001    B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                  128  avgt   15   113.000            counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                   128  avgt   15    85.000                ms
>> StringBuilders.toStringWithMixedChars                                            256  avgt   15  1316.652 ± 112.771   ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             256  avgt   15   800.864 ±  76.869  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        256  avgt   15  1648.288 ±   0.162    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    256  avgt   15   599.736 ± 174.001  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               256  avgt   15  1229.669 ± 318.518    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                256  avgt   15     0.001 ±   0.001  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           256  avgt   15     0.001 ±   0.002    B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                  256  avgt   15   133.000            counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                   256  avgt   15    92.000                ms
>> StringBuilders.toStringWithMixedChars                                           1024  avgt   15  5204.303 ± 418.115   ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                            1024  avgt   15   768.730 ±  72.945  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                       1024  avgt   15  6256.844 ±   0.358    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                   1024  avgt   15   655.852 ± 121.602  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm              1024  avgt   15  5315.265 ± 578.878    B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space               1024  avgt   15     0.002 ±   0.002  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm          1024  avgt   15     0.014 ±   0.011    B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                 1024  avgt   15    96.000            counts
>> 
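
The 5 * S versus 4 * S estimate in the quoted description can be checked with a few lines of Java. This is only a sketch of that arithmetic, not a measurement: `AllocationEstimate` and its method names are made up for illustration, and it assumes (as the 5 * S total implies) that only the inflated array, the compression scratch array, and the final copy are counted.

```java
/** Hypothetical helper that reproduces the allocation estimate from the description. */
final class AllocationEstimate {

    /** Bytes allocated when toString() attempts a compression that fails. */
    static long withCompression(long s) {
        long inflated = 2 * s;  // UTF16 array allocated when the builder inflates
        long scratch = s;       // LATIN1 array allocated by the failed compression
        long copy = 2 * s;      // UTF16 copy that ends up inside the String
        return inflated + scratch + copy;
    }

    /** Bytes allocated when the failed compression is skipped. */
    static long withoutCompression(long s) {
        long inflated = 2 * s;
        long copy = 2 * s;
        return inflated + copy;
    }

    public static void main(String[] args) {
        long s = 128;  // assumed initial capacity in bytes
        long before = withCompression(s);
        long after = withoutCompression(s);
        double savedPercent = 100.0 * (before - after) / before;
        System.out.printf("before=%d after=%d saved=%.0f%%%n", before, after, savedPercent);
    }
}
```

For S = 128 this gives 640 bytes before and 512 bytes after, which is the 20% figure. Object headers and any other per-operation allocations are ignored here.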

Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings

2022-03-03 Thread Daniel Jeliński
On Thu, 3 Mar 2022 02:36:58 GMT, Xin Liu wrote:

> If an AbstractStringBuilder only grows, an inflated value that has already been
> encoded in UTF16 can't be compressed.
> toString() can skip compression in this case. This saves an array
> allocation in StringUTF16::compress().
>
> java.io.BufferedReader::readLine() is one case where the StringBuilder only grows.
> 
> In the microbenchmark, we expect allocation/op to drop by 20%. The initial
> capacity of the StringBuilder is S bytes. When it encounters the first character
> that can't be encoded in LATIN1, it inflates and allocates a new array of 2*S bytes.
> `toString()` then tries to compress that value, so it needs to allocate S bytes.
> The last step allocates 2*S bytes because it has to copy the string. So it
> allocates 5 * S bytes in total. By skipping the failed
> compression, it only allocates 4 * S bytes, a 20% reduction. In real execution,
> we observe a 16% allocation reduction, as reported by the JMH GC profiler metric
> `gc.alloc.rate.norm`. I think the gap is because HotSpot can't track all
> allocations.
>
> Not only does allocation drop; runtime performance (ns/op) also improves by
> 3.34% to 18.91%.
> 
> Before: 
> 
> $ make test TEST="micro:org.openjdk.bench.java.lang.StringBuilders.toStringWithMixedChars" \
>     MICRO="OPTIONS=-prof gc -gc true -o before.log -jvm $HOME/Devel/jdk_baseline/bin/java"
> 
> Benchmark                                                               (MIXED_SIZE)  Mode  Cnt     Score     Error   Units
> StringBuilders.toStringWithMixedChars                                            128  avgt   15   649.846 ±  76.291   ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             128  avgt   15   872.855 ± 128.259  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        128  avgt   15   880.121 ±   0.050    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    128  avgt   15   707.730 ± 194.421  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               128  avgt   15   706.602 ±  94.504    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                128  avgt   15     0.001 ±   0.002  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           128  avgt   15     0.001 ±   0.001    B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                  128  avgt   15   113.000            counts
> StringBuilders.toStringWithMixedChars:·gc.time                                   128  avgt   15    85.000                ms
> StringBuilders.toStringWithMixedChars                                            256  avgt   15  1316.652 ± 112.771   ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             256  avgt   15   800.864 ±  76.869  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        256  avgt   15  1648.288 ±   0.162    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    256  avgt   15   599.736 ± 174.001  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               256  avgt   15  1229.669 ± 318.518    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                256  avgt   15     0.001 ±   0.001  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           256  avgt   15     0.001 ±   0.002    B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                  256  avgt   15   133.000            counts
> StringBuilders.toStringWithMixedChars:·gc.time                                   256  avgt   15    92.000                ms
> StringBuilders.toStringWithMixedChars                                           1024  avgt   15  5204.303 ± 418.115   ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                            1024  avgt   15   768.730 ±  72.945  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                       1024  avgt   15  6256.844 ±   0.358    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                   1024  avgt   15   655.852 ± 121.602  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm              1024  avgt   15  5315.265 ± 578.878    B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space               1024  avgt   15     0.002 ±   0.002  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm          1024  avgt   15     0.014 ±   0.011    B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                 1024  avgt   15    96.000            counts
> StringBuilders.toStringWithMixedChars:·gc.time                                  1024  avgt   15    86.000                ms
> 
> 
> 
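
The grow-only observation in the quoted description can be illustrated with a toy builder. This is a sketch under stated assumptions, not the code from the pull request: `ToyBuilder`, its two flags, and `tryCompress` are hypothetical names, and the class stores a `char[]` rather than the `byte[]` plus coder that the JDK uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy model of a string builder; every name here is hypothetical. */
final class ToyBuilder {
    private char[] value = new char[16];
    private int count;
    private boolean inflated;            // a char outside LATIN1 has been appended
    private boolean shrunkSinceInflate;  // a truncation may have removed that char
    int compressAttempts;                // how often toString() tried to compress

    ToyBuilder append(String s) {
        int needed = count + s.length();
        if (needed > value.length) {
            value = Arrays.copyOf(value, Math.max(value.length * 2, needed));
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0xFF) {
                inflated = true;
            }
            value[count++] = c;
        }
        return this;
    }

    /** Truncates the content; this toy does not support growing via setLength. */
    ToyBuilder setLength(int newLength) {
        if (newLength < 0 || newLength > count) {
            throw new IndexOutOfBoundsException("newLength: " + newLength);
        }
        if (inflated && newLength < count) {
            shrunkSinceInflate = true;
        }
        count = newLength;
        return this;
    }

    /** Returns LATIN1 bytes, or null if some char does not fit in one byte. */
    private byte[] tryCompress() {
        byte[] latin1 = new byte[count];  // the allocation the fast path avoids
        for (int i = 0; i < count; i++) {
            if (value[i] > 0xFF) {
                return null;
            }
            latin1[i] = (byte) value[i];
        }
        return latin1;
    }

    @Override
    public String toString() {
        // If the builder inflated and has only grown since, compression must fail.
        boolean knownUtf16 = inflated && !shrunkSinceInflate;
        if (!knownUtf16) {
            compressAttempts++;
            byte[] latin1 = tryCompress();
            if (latin1 != null) {
                return new String(latin1, StandardCharsets.ISO_8859_1);
            }
        }
        return new String(value, 0, count);
    }

    public static void main(String[] args) {
        ToyBuilder b = new ToyBuilder().append("abc").append("\u4e2d");
        String grown = b.toString();
        System.out.println(grown.length() + " chars, attempts=" + b.compressAttempts);
        b.setLength(3);
        String cut = b.toString();
        System.out.println(cut + ", attempts=" + b.compressAttempts);
    }
}
```

In this sketch the first `toString()` skips the scratch allocation because the builder has only grown since it inflated; after `setLength(3)` the flag no longer holds, so compression is attempted again and succeeds.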
