On Thu, 18 Feb 2021 20:14:12 GMT, Сергей Цыпанов
<[email protected]> wrote:
>> Some of these changes conflict with #2334, which suggests removing the
>> `coder` and `isLatin1` methods from `String`.
>>
>> As a more general point I think it would be good to explore options that
>> do not increase leakage of the implementation detail that `String`s are
>> latin1- or utf16-encoded outside of java.lang.
>
> Hi @cl4es,
>> Some of these changes conflict with #2334, which suggests removing the
>> `coder` and `isLatin1` methods from `String`.
>
> I've checked out Aleksey's branch and applied my changes on top of it. The
> only thing I had to change to make it work was replacing
>
>     public boolean isLatin1(String str) {
>         return str.isLatin1();
>     }
>
> with
>
>     public boolean isLatin1(String str) {
>         return str.coder == String.LATIN1;
>     }
>
> The rest of the code was left intact. `jdk:tier1` is OK after the change.
>
>> As a more general point I think it would be good to explore options that
>> do not increase leakage of the implementation detail that `String`s are
>> latin1- or utf16-encoded outside of java.lang.
>
> Apart from `JavaLangAccess`, the only thing that comes to mind is
> reflection, but that would destroy all of the improvement. Otherwise I cannot
> see any other way to access the package-private latin/non-latin
> functionality of `j.l.String` from the `java.util` package. Am I missing any
> other options?

A less intrusive alternative would be to use a `StringBuilder`; see the changes
in this branch:
https://github.com/openjdk/jdk/compare/master...cl4es:stringjoin_improvement?expand=1
(I adapted your StringJoinerBenchmark to work with the ASCII-only build
constraint.)

This underperforms compared to your patch since StringBuilder.toString needs to
do a copy, but improves over the baseline:

Benchmark                                               (count)  (length)  (mode)  Mode  Cnt      Score      Error  Units
StringJoinerBenchmark.stringJoiner                          100        64   latin  avgt    5   5420.701 ± 1433.485  ns/op
StringJoinerBenchmark.stringJoiner:·gc.alloc.rate.norm      100        64   latin  avgt    5  20640.428 ±    0.130   B/op

Patch:
Benchmark                                               (count)  (length)  (mode)  Mode  Cnt      Score      Error  Units
StringJoinerBenchmark.stringJoiner                          100        64   latin  avgt    5   4271.401 ±  677.560  ns/op
StringJoinerBenchmark.stringJoiner:·gc.alloc.rate.norm      100        64   latin  avgt    5  14136.294 ±    0.095   B/op

The comparative benefit is that we'd avoid punching more holes into String
implementation details for now. Not ruling that out indefinitely, but I think
it needs a stronger motivation than to improve StringJoiner alone.
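
To make the `StringBuilder` idea concrete, here is a minimal sketch of the general shape. It is not the code in the linked branch: the class name `BuilderJoinSketch` and the `join` helper are hypothetical, and it uses only public API, so nothing outside java.lang needs to know how a `String` is encoded. The assumption is that presizing the builder to the exact result length avoids intermediate growth, leaving the copy in `toString` as the remaining overhead.

```java
import java.util.List;

public class BuilderJoinSketch {

    // Hypothetical helper for illustration only; not the StringJoiner change itself.
    static String join(String prefix, String delimiter, String suffix, List<String> elems) {
        // Compute the exact result length up front so the builder never resizes.
        // (A real implementation would also guard against int overflow here.)
        int len = prefix.length() + suffix.length();
        for (String e : elems) {
            len += e.length();
        }
        if (!elems.isEmpty()) {
            len += delimiter.length() * (elems.size() - 1);
        }

        StringBuilder sb = new StringBuilder(len);
        sb.append(prefix);
        for (int i = 0; i < elems.size(); i++) {
            if (i > 0) {
                sb.append(delimiter);
            }
            sb.append(elems.get(i));
        }
        sb.append(suffix);

        // toString copies the builder's internal array into the new String;
        // this is the extra copy that a direct byte[] approach would avoid.
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join("[", ", ", "]", List.of("a", "b", "c"))); // prints [a, b, c]
    }
}
```

The builder handles the latin1/utf16 distinction internally, which is what keeps the encoding detail from leaking out of java.lang.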
-------------
PR: https://git.openjdk.java.net/jdk/pull/2627