Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v2]
On Tue, 8 Mar 2022 17:33:58 GMT, Daniel Jeliński wrote:

> As for a better name for `growOnly`, something like `mayBeLatin` would better
> convey the variable's purpose. What do you think?

These are tricky. I need to add tests to cover them. The problem is that this patch fails to copy the attribute 'growOnly/maybeLatin1' over from the other AbstractStringBuilder. I think we can fix this loophole. Other types such as String are well-formed; they don't suffer from this issue.

- PR: https://git.openjdk.java.net/jdk/pull/7671
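To make the loophole concrete, here is a minimal toy model. It is not the JDK code: `ToyBuilder`, its fields, and the `copyFlag` parameter are hypothetical names used only for illustration, and it assumes that appending a UTF16-coded builder inflates the target. It shows how a target builder can end up UTF16-coded with Latin1-only content while still believing it has only grown, unless the flag is carried over from the source.

```java
/** Toy model (not the JDK class) of a builder that tracks whether its UTF16 content may be Latin1-only. */
public final class ToyBuilder {
    private final StringBuilder chars = new StringBuilder();
    private boolean utf16;        // true once the content has been inflated
    private boolean maybeLatin1;  // true if the UTF16 content might hold only Latin1 chars

    public ToyBuilder append(char c) {
        if (c > 0xFF) {
            utf16 = true;         // first non-Latin1 char forces inflation
        }
        chars.append(c);
        return this;
    }

    public ToyBuilder setLength(int newLength) {
        chars.setLength(newLength);
        if (utf16) {
            maybeLatin1 = true;   // shrinking may have removed every non-Latin1 char
        }
        return this;
    }

    /** copyFlag == false models the loophole; copyFlag == true models the proposed fix. */
    public ToyBuilder append(ToyBuilder other, boolean copyFlag) {
        if (other.utf16) {
            utf16 = true;         // assumption: the target inflates to match the source's coder
        }
        if (copyFlag) {
            maybeLatin1 |= other.maybeLatin1;
        }
        chars.append(other.chars);
        return this;
    }

    /** True if a toString() in this model would skip the compression attempt. */
    public boolean skipsCompression() {
        return utf16 && !maybeLatin1;
    }

    /** True if every char currently held fits in Latin1. */
    public boolean isActuallyLatin1() {
        for (int i = 0; i < chars.length(); i++) {
            if (chars.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }
}
```

With a source builder that was inflated and then truncated back to Latin1-only content, `append(source, false)` leaves the target in the bad state (`skipsCompression()` and `isActuallyLatin1()` both true), while `append(source, true)` does not.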
Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v2]
On Mon, 7 Mar 2022 23:36:41 GMT, Xin Liu wrote:

>> If AbstractStringBuilder only grows, the inflated value, which has been
>> encoded in UTF16, can't be compressed. toString() can skip compression
>> in this case. This saves an array allocation in StringUTF16::compress().
>>
>> java.io.BufferedReader::readLine() is a case where the StringBuilder only
>> grows.
>>
>> In the microbenchmark, we expect allocation/op to drop by 20%. The initial
>> capacity of the StringBuilder is S bytes. When it encounters the first
>> character that can't be encoded in LATIN1, it inflates and allocates a new
>> array of 2*S bytes. `toString()` then tries to compress that value, so it
>> needs to allocate S bytes. The last step allocates 2*S bytes because it has
>> to copy the string. So it allocates 5*S bytes in total. By skipping the
>> failed compression, it allocates only 4*S bytes, a 20% reduction. In real
>> execution, we observe a 16% allocation reduction, as reported by the JMH GC
>> profiler metric `gc.alloc.rate.norm`. I think it's because HotSpot can't
>> track all allocations.
>>
>> Not only does allocation drop, the runtime performance (ns/op) also
>> improves by 3.34% to 18.91%.
>>
>> Before:
>>
>> $ make test TEST="micro:org.openjdk.bench.java.lang.StringBuilders.toStringWithMixedChars" \
>>     MICRO="OPTIONS=-prof gc -gc true -o before.log -jvm $HOME/Devel/jdk_baseline/bin/java"
>>
>> Benchmark                                                               (MIXED_SIZE)  Mode  Cnt      Score      Error  Units
>> StringBuilders.toStringWithMixedChars                                            128  avgt   15    649.846 ±   76.291  ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             128  avgt   15    872.855 ±  128.259  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        128  avgt   15    880.121 ±    0.050  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    128  avgt   15    707.730 ±  194.421  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               128  avgt   15    706.602 ±   94.504  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                128  avgt   15      0.001 ±    0.002  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           128  avgt   15      0.001 ±    0.001  B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                  128  avgt   15    113.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                   128  avgt   15     85.000             ms
>> StringBuilders.toStringWithMixedChars                                            256  avgt   15   1316.652 ±  112.771  ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             256  avgt   15    800.864 ±   76.869  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        256  avgt   15   1648.288 ±    0.162  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    256  avgt   15    599.736 ±  174.001  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               256  avgt   15   1229.669 ±  318.518  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                256  avgt   15      0.001 ±    0.001  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           256  avgt   15      0.001 ±    0.002  B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                  256  avgt   15    133.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                   256  avgt   15     92.000             ms
>> StringBuilders.toStringWithMixedChars                                           1024  avgt   15   5204.303 ±  418.115  ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                            1024  avgt   15    768.730 ±   72.945  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                       1024  avgt   15   6256.844 ±    0.358  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                   1024  avgt   15    655.852 ±  121.602  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm              1024  avgt   15   5315.265 ±  578.878  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space               1024  avgt   15      0.002 ±    0.002  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm          1024  avgt   15      0.014 ±    0.011  B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                 1024  avgt   15     96.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time
Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v2]
On Mon, 7 Mar 2022 12:10:51 GMT, Daniel Jeliński wrote:

>> Xin Liu has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> make sure String(StringBuffer) is still synchronized.
>
> src/java.base/share/classes/java/lang/String.java line 1446:
>
>> 1444: */
>> 1445: public String(StringBuffer buffer) {
>> 1446: this(buffer, null);
>
> This method is no longer synchronized on StringBuffer

You're right. Fixed in revision #2. This could have been very tricky to discover. Thanks for the heads-up!

- PR: https://git.openjdk.java.net/jdk/pull/7671
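For readers following the synchronization point, here is a minimal sketch of the general idea. It is not the code from the patch: `BufferSnapshot` and `snapshot` are hypothetical names. Since StringBuffer's own methods synchronize on the buffer instance, a caller that reads its length and contents in separate steps has to hold that same monitor across both reads to get a consistent copy.

```java
/** Hypothetical helper (not JDK code): takes a consistent copy of a StringBuffer. */
public final class BufferSnapshot {
    private BufferSnapshot() {}

    public static String snapshot(StringBuffer buffer) {
        // Hold the buffer's monitor so that no other thread can append or
        // truncate between reading the length and copying the chars.
        synchronized (buffer) {
            int len = buffer.length();
            char[] copy = new char[len];
            buffer.getChars(0, len, copy, 0);
            return new String(copy);
        }
    }
}
```

Without the `synchronized` block, another thread could shrink the buffer after `length()` returns, and the subsequent `getChars` call could then fail or copy inconsistent contents.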
Re: RFR: 8282429: StringBuilder/StringBuffer.toString() skip compressing for UTF16 strings [v2]
> If AbstractStringBuilder only grows, the inflated value, which has been
> encoded in UTF16, can't be compressed. toString() can skip compression
> in this case. This saves an array allocation in StringUTF16::compress().
>
> java.io.BufferedReader::readLine() is a case where the StringBuilder only
> grows.
>
> In the microbenchmark, we expect allocation/op to drop by 20%. The initial
> capacity of the StringBuilder is S bytes. When it encounters the first
> character that can't be encoded in LATIN1, it inflates and allocates a new
> array of 2*S bytes. `toString()` then tries to compress that value, so it
> needs to allocate S bytes. The last step allocates 2*S bytes because it has
> to copy the string. So it allocates 5*S bytes in total. By skipping the
> failed compression, it allocates only 4*S bytes, a 20% reduction. In real
> execution, we observe a 16% allocation reduction, as reported by the JMH GC
> profiler metric `gc.alloc.rate.norm`. I think it's because HotSpot can't
> track all allocations.
>
> Not only does allocation drop, the runtime performance (ns/op) also
> improves by 3.34% to 18.91%.
>
> Before:
>
> $ make test TEST="micro:org.openjdk.bench.java.lang.StringBuilders.toStringWithMixedChars" \
>     MICRO="OPTIONS=-prof gc -gc true -o before.log -jvm $HOME/Devel/jdk_baseline/bin/java"
>
> Benchmark                                                               (MIXED_SIZE)  Mode  Cnt      Score      Error  Units
> StringBuilders.toStringWithMixedChars                                            128  avgt   15    649.846 ±   76.291  ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             128  avgt   15    872.855 ±  128.259  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        128  avgt   15    880.121 ±    0.050  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    128  avgt   15    707.730 ±  194.421  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               128  avgt   15    706.602 ±   94.504  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                128  avgt   15      0.001 ±    0.002  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           128  avgt   15      0.001 ±    0.001  B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                  128  avgt   15    113.000             counts
> StringBuilders.toStringWithMixedChars:·gc.time                                   128  avgt   15     85.000             ms
> StringBuilders.toStringWithMixedChars                                            256  avgt   15   1316.652 ±  112.771  ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             256  avgt   15    800.864 ±   76.869  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        256  avgt   15   1648.288 ±    0.162  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    256  avgt   15    599.736 ±  174.001  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               256  avgt   15   1229.669 ±  318.518  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                256  avgt   15      0.001 ±    0.001  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           256  avgt   15      0.001 ±    0.002  B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                  256  avgt   15    133.000             counts
> StringBuilders.toStringWithMixedChars:·gc.time                                   256  avgt   15     92.000             ms
> StringBuilders.toStringWithMixedChars                                           1024  avgt   15   5204.303 ±  418.115  ns/op
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                            1024  avgt   15    768.730 ±   72.945  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                       1024  avgt   15   6256.844 ±    0.358  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                   1024  avgt   15    655.852 ±  121.602  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm              1024  avgt   15   5315.265 ±  578.878  B/op
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space               1024  avgt   15      0.002 ±    0.002  MB/sec
> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm          1024  avgt   15      0.014 ±    0.011  B/op
> StringBuilders.toStringWithMixedChars:·gc.count                                 1024  avgt   15     96.000             counts
> StringBuilders.toStringWithMixedChars:·gc.time                                  1024  avgt   15     86.000             ms
>
> After
>
> $ make test