----- Mail original -----
> De: "Сергей Цыпанов" <sergei.tsypa...@yandex.ru>
> À: "jonathan gibbons" <jonathan.gibb...@oracle.com>, "core-libs-dev" 
> <core-libs-dev@openjdk.java.net>
> Envoyé: Mardi 4 Février 2020 08:53:31
> Objet: Re: [PATCH] Enhancement proposal for java.util.StringJoiner

> Hello,

Hi Sergey,

> 
> I'd probably agree about a new class in java.lang, but what is wrong about
> exposing package-private method
> which doesn't modify the state of the object and has no side effects?

You can not change the implementation anymore,
by example if instead of having a split between latin1 and non latin1, we 
decide in the future to split between utf8 and non utf8.

If you want to optimize StringJoiner, the best way to do it is to use the 
shared secret mechanism so a java.util class can see implementation details of 
a java.lang class without exposing those details publicly.
As an example, take a look to EnumSet and its implementations.

regards,
Rémi

> 
> 04.02.2020, 00:58, "Jonathan Gibbons" <jonathan.gibb...@oracle.com>:
>> Sergey,
>>
>> It is equally bad to create a new class in the java.lang package as it
>> is to add a new public method to java.lang.String.
>>
>> -- Jon
>>
>> On 2/3/20 2:38 PM, Сергей Цыпанов wrote:
>>>  Hello,
>>>
>>>  as of JDK14 java.util.StringJoiner still uses char[] as a storage of glued
>>>  Strings.
>>>
>>>  This applies for the cases when all joined Strings as well as delimiter, 
>>> prefix
>>>  and suffix contain only ASCII symbols.
>>>
>>>  As a result when StringJoiner.toString() is invoked, byte[] stored in 
>>> String is
>>>  inflated in order to fill in char[] and
>>>  finally char[] is compressed when constructor of String is called:
>>>
>>>  String delimiter = this.delimiter;
>>>  char[] chars = new char[this.len + addLen];
>>>  int k = getChars(this.prefix, chars, 0);
>>>  if (size > 0) {
>>>       k += getChars(elts[0], chars, k); // inflate byte[] -> char[]
>>>
>>>       for(int i = 1; i < size; ++i) {
>>>           k += getChars(delimiter, chars, k);
>>>           k += getChars(elts[i], chars, k);
>>>       }
>>>  }
>>>
>>>  k += getChars(this.suffix, chars, k);
>>>  return new String(chars); // compress char[] -> byte[]
>>>
>>>  This can be improved by detecting cases when String.isLatin1() returns 
>>> true for
>>>  all involved Strings.
>>>
>>>  I've prepared a patch along with benchmark proving that this change is 
>>> correct
>>>  and brings improvement.
>>>  The only concern I have is about String.isLatin1(): as far as String 
>>> belongs to
>>>  java.lang and StringJoiner to java.util
>>>  package-private String.isLatin1() cannot be directly accessed, we need to 
>>> make
>>>  it public for successful compilation.
>>>
>>>  Another solution is to create an intermediate utility class located in 
>>> java.lang
>>>  which delegates the call to String.isLatin1():
>>>
>>>  package java.lang;
>>>
>>>  public class StringHelper {
>>>       public static boolean isLatin1(String str) {
>>>           return str.isLatin1();
>>>       }
>>>  }
>>>
>>>  This allows to keep java.lang.String intact and have access to it's
>>>  package-private method outside of java.lang package.
>>>
>>>  Below I've added results of benchmarking for specified case (all Strings 
>>> are
>>>  Latin1). The other case (at least one String is UTF-8) uses existing code 
>>> so
>>>  there will be only a tiny regression due to several if-checks.
>>>
>>>  With best regards,
>>>  Sergey Tsypanov
>>>
>>>                                             (count) (length) Original 
>>> Patched Units
>>>  stringJoiner 1 1 26.7 ± 1.3 38.2 ± 1.1 ns/op
>>>  stringJoiner 1 5 27.4 ± 0.0 40.5 ± 2.2 ns/op
>>>  stringJoiner 1 10 29.6 ± 1.9 38.4 ± 1.9 ns/op
>>>  stringJoiner 1 100 61.1 ± 6.9 47.6 ± 0.6 ns/op
>>>  stringJoiner 5 1 91.1 ± 6.7 83.6 ± 2.0 ns/op
>>>  stringJoiner 5 5 96.1 ± 10.7 85.6 ± 1.1 ns/op
>>>  stringJoiner 5 10 105.5 ± 14.3 84.7 ± 1.1 ns/op
>>>  stringJoiner 5 100 266.6 ± 30.1 139.6 ± 14.0 ns/op
>>>  stringJoiner 10 1 190.7 ± 23.0 162.0 ± 2.9 ns/op
>>>  stringJoiner 10 5 200.0 ± 16.9 167.5 ± 11.0 ns/op
>>>  stringJoiner 10 10 216.4 ± 12.4 164.8 ± 1.7 ns/op
>>>  stringJoiner 10 100 545.3 ± 49.7 282.2 ± 12.0 ns/op
>>>  stringJoiner 100 1 1467.0 ± 90.3 1302.0 ± 18.5 ns/op
>>>  stringJoiner 100 5 1491.8 ± 166.2 1493.0 ± 135.4 ns/op
>>>  stringJoiner 100 10 1768.8 ± 160.6 1760.8 ± 111.4 ns/op
>>>  stringJoiner 100 100 3654.3 ± 113.1 3120.9 ± 175.9 ns/op
>>>
>>>  stringJoiner:·gc.alloc.rate.norm 1 1 120.0 ± 0.0 120.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 1 5 128.0 ± 0.0 120.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 1 10 144.0 ± 0.0 136.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 1 100 416.0 ± 0.0 312.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 5 1 144.0 ± 0.0 136.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 5 5 200.0 ± 0.0 168.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 5 10 272.0 ± 0.0 216.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 5 100 1632.0 ± 0.0 1128.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 10 1 256.0 ± 0.0 232.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 10 5 376.0 ± 0.0 312.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 10 10 520.0 ± 0.0 408.0 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 10 100 3224.1 ± 0.0 2216.1 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 100 1 1760.2 ± 14.9 1544.2 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 100 5 2960.3 ± 14.9 2344.2 ± 0.0 B/op
>>>  stringJoiner:·gc.alloc.rate.norm 100 10 4440.4 ± 0.0 3336.3 ± 0.0 B/op
> >>  stringJoiner:·gc.alloc.rate.norm 100 100 31449.3 ± 12.2 21346.7 ± 14.7 
> >> B/op

Reply via email to