Yes, I agree that StringJoiner could benefit from a hint about the
expected number of elements to join.
On the other hand, with the current allocation scheme, each reference
stored in the elts[] will only be copied at most twice on average, so
the total performance improvement might not be that significant to
justify the API change.
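The "at most twice on average" estimate can be checked with a small simulation. This is only a sketch: it assumes the backing array starts at 8 slots and doubles when full, and the class and method names are hypothetical rather than JDK code.

```java
import java.util.Arrays;

// Hypothetical helper (not JDK code): counts how many reference copies a
// doubling array performs while n elements are appended one by one.
final class GrowthCopies {

    // Returns the total number of reference copies caused by resizing.
    static long copiesDuringGrowth(int n, int initialCapacity) {
        Object[] elts = new Object[initialCapacity];
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size == elts.length) {
                // Arrays.copyOf moves every stored reference exactly once.
                copies += size;
                elts = Arrays.copyOf(elts, elts.length * 2);
            }
            elts[size++] = "x";
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[] {8, 9, 1000, 1_000_000}) {
            long copies = copiesDuringGrowth(n, 8);
            System.out.printf("n=%d copies=%d per element=%.2f%n",
                    n, copies, (double) copies / n);
        }
    }
}
```

For 1000 elements this counts 8 + 16 + ... + 512 = 1016 copies, roughly one per element. Under the doubling assumption the total stays below 2n for any n, so a size hint would save at most about two reference copies per element.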
With kind regards,
Ivan
On 3/9/19 11:52 AM, Сергей Цыпанов wrote:
Hello Ivan,
indeed your solution for Iterables is more compact than mine (it can be even
shorter with a method reference), however it doesn't solve the problem of array
reallocation.
See my response to Remi below.
08.03.2019, 22:01, "Ivan Gerasimov" <ivan.gerasi...@oracle.com>:
Hi Sergei!
As you said, this new class is pretty much like StringJoiner with
reduced functionality.
For appending all elements of an Iterable you could use
list.forEach(s -> sj.add(s)).
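A minimal sketch of that suggestion, written with a method reference. The class and method names are made up for illustration; an empty delimiter is used so the result is a plain concatenation.

```java
import java.util.List;
import java.util.StringJoiner;

// Illustrative only: the wrapper class and method names are hypothetical.
final class JoinerForEach {

    // Appends every element of the Iterable to a StringJoiner with an
    // empty delimiter, so the result is a plain concatenation.
    static String concat(Iterable<? extends CharSequence> parts) {
        StringJoiner sj = new StringJoiner("");
        parts.forEach(sj::add); // same as parts.forEach(s -> sj.add(s))
        return sj.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat(List.of("foo", "bar", "baz"))); // prints foobarbaz
    }
}
```

The StringJoiner still starts with its default internal array here, so this variant does not avoid reallocation for long inputs.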
With kind regards,
Ivan
On 3/8/19 11:22 AM, Сергей Цыпанов wrote:
Hello,
I have an enhancement proposal for some cases of String concatenation in Java.
Currently we concatenate Strings mostly using java.lang.StringBuilder. The main
disadvantage of StringBuilder is its underlying char array, or rather the need
to resize it when the capacity is about to be exceeded and then to copy the
array content into a newly allocated array.
One existing alternative is StringJoiner. Before JDK 9 it was a kind of
decorator over StringBuilder, but later it was reworked to store the appended
Strings in a String[] and to accumulate the overall required capacity in an
int field. This makes it possible to allocate the char[] only once, with the
exact size, in the toString() method, reducing allocation cost.
My proposal is to copy the code of StringJoiner into a newly created class
java.util.StringChain, drop the code responsible for the delimiter, prefix
and suffix, and use it instead of StringBuilder in the common
StringBuilder::append concatenation pattern.
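To make the idea concrete, here is a rough sketch of what such a class might look like. It is not the attached patch: the class name StringChainSketch and its members are hypothetical, and it uses only public APIs, so new String(char[]) makes one extra copy that JDK-internal code could avoid.

```java
import java.util.Arrays;

// Hypothetical sketch only; the attached patch may differ in every detail.
final class StringChainSketch {
    private String[] elts; // appended Strings, in order
    private int size;      // number of Strings stored in elts
    private int len;       // total number of chars appended so far

    StringChainSketch() {
        this(8); // default capacity of 8, as mentioned in the benchmarks
    }

    StringChainSketch(int expectedCount) {
        elts = new String[Math.max(1, expectedCount)];
    }

    StringChainSketch add(CharSequence cs) {
        String s = String.valueOf(cs); // null becomes "null", like StringBuilder
        if (size == elts.length) {
            elts = Arrays.copyOf(elts, size * 2); // copies references only
        }
        len += s.length(); // overflow of len is ignored in this sketch
        elts[size++] = s;
        return this;
    }

    @Override
    public String toString() {
        char[] chars = new char[len]; // allocated once, with the exact size
        int pos = 0;
        for (int i = 0; i < size; i++) {
            String s = elts[i];
            s.getChars(0, s.length(), chars, pos);
            pos += s.length();
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(new StringChainSketch().add("foo").add("bar")); // prints foobar
    }
}
```

Only String references are copied when the array grows. The character data is copied in toString(), once the final length is known.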
Possible use-cases for proposed code are:
- plain String concatenation
- String::chain (new methods)
- Stream.collect(Collectors.joining())
- StringConcatFactory
We can create new methods String.chain(Iterable<CharSequence>) and
String.chain(CharSequence...) which encapsulate boilerplate code like
    StringBuilder sb = new StringBuilder();
    for (CharSequence cs : charSequences) {
        sb.append(cs);
    }
    String result = sb.toString();
into one line:
    String result = String.chain(charSequences);
As for performance, I've done some measurements using JMH on my work machine
(Intel i7-7700) for both Latin and non-Latin Strings of different sizes and
counts.
Here are the results:
https://github.com/stsypanov/string-chain/blob/master/results/StringBuilderVsStringChainBenchmark.txt
There are a few corner cases (e.g. 1000 Strings of length 1 appended) where
StringBuilder beats StringChain constructed with the default capacity of 8,
but StringChain constructed with the exact count of added Strings almost always
wins, especially when dealing with non-Latin chars (Russian in my case).
I've also created a separate repo on GitHub with benchmarks:
https://github.com/stsypanov/string-chain
The key feature here is the ability to allocate a String array of exact size in
cases where we know the number of added elements.
Thus I think that if the change is accepted we can also add an overloaded
method String.chain(Collection<CharSequence>), as Collection::size allows us to
construct a StringChain of exact size.
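The Collection overload could be approximated with public APIs along the following lines. This is a sketch under assumptions: ChainHelper.chain is a hypothetical stand-in for the proposed String.chain(Collection<CharSequence>), and the collection is assumed not to change while it is being iterated.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the proposed String.chain(Collection<CharSequence>).
final class ChainHelper {

    static String chain(Collection<? extends CharSequence> parts) {
        // Collection::size gives the exact element count, so the array never grows.
        String[] elts = new String[parts.size()];
        int size = 0;
        int len = 0;
        for (CharSequence cs : parts) {
            String s = String.valueOf(cs);
            elts[size++] = s;
            len += s.length();
        }
        // Exact capacity: the builder's internal buffer is never resized.
        StringBuilder sb = new StringBuilder(len);
        for (String s : elts) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(chain(List.of("foo", "bar"))); // prints foobar
    }
}
```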
Patch is attached.
Kind regards,
Sergei Tsypanov
--
With kind regards,
Ivan Gerasimov