Hi Peter, Tagir,
On 2020-06-14 23:04, Peter Levart wrote:
Hi Tagir,
I think that there might be cases where one of the arguments of
concatenation is constant empty String. I found myself sometimes doing
"" + someNonStringValue (instead of more readable
String.valueOf(someNonStringValue)) simply because of laziness. So does
this optimization help in that case as much as with non-constant "" +
"longlonglongline" ?
perhaps we should consider adding unary concat specializations:
http://cr.openjdk.java.net/~redestad/scratch/unary_concat.00/
For primitive arguments we just return a handle to String.valueOf,
which helps with bootstrap overheads (a Hello World that does "" +
args.length drops from 66ms to 51ms on my machine).
"" + stringArg can't be reduced to the objectStringifier, since we
need that new String wrapper.
Results on the added concatEmpty micros - before:
Benchmark (intValue) Mode Cnt Score Error Units
StringConcat.concatEmptyInt 4711 avgt 5 15.461 ± 1.266 ns/op
StringConcat.concatEmptyString 4711 avgt 5 8.173 ± 0.515 ns/op
- after:
Benchmark (intValue) Mode Cnt Score Error Units
StringConcat.concatEmptyInt 4711 avgt 5 15.470 ± 1.235 ns/op
StringConcat.concatEmptyString 4711 avgt 5 18.420 ± 1.572 ns/op
This doesn't help the case where the arguments are both arguments andone
happen to be an empty string. Tagir's patch would still make sense
there.
/Claes
Also, for cases where one argument evaluates to empty string and the
other is not a String, I think you could simply return the seconds
string (not creating another instance with reused value/coder):
if (s1.isEmpty()) {
return s2 == second ? new String(s2.value(), s2.coder()) : s2;
}
Regards, Peter
On 6/13/20 7:08 AM, Tagir Valeev wrote:
Hello!
It's quite possible that when we concatenate two strings, one of them
appears to be empty. We cannot simply return another string in this
case, as JLS 15.18.1 explicitly says (for unknown to me reason) about
the result of the string concatenation expression that 'The String
object is newly created'. However, it's still possible to reuse the
internal array of another string to reduce allocations:
--- src/java.base/share/classes/java/lang/StringConcatHelper.java
(revision 58998:04e3d254c76be87788a40cbd66d013140ea951d8)
+++ src/java.base/share/classes/java/lang/StringConcatHelper.java
(revision 58998+:04e3d254c76b+)
@@ -420,6 +420,12 @@
static String simpleConcat(Object first, Object second) {
String s1 = stringOf(first);
String s2 = stringOf(second);
+ if (s1.isEmpty()) {
+ return new String(s2.value(), s2.coder());
+ }
+ if (s2.isEmpty()) {
+ return new String(s1.value(), s1.coder());
+ }
// start "mixing" in length and coder or arguments, order is
not
// important
long indexCoder = mix(initialCoder(), s2);
Very simple benchmark like this validates that the concatenation
became faster if one of the strings is empty:
@Param({"", "longlonglongline"})
String data;
@Param({"", "longlonglongline"})
String data2;
@Benchmark
public String plus() {
return data + data2;
}
Without patch I observe on VM 15-ea+20-899:
Benchmark (data) (data2) Mode Cnt Score Error
Units
plus avgt 30 15,335 ± 0,186
ns/op
plus longlonglongline avgt 30 19,867 ± 0,109
ns/op
plus longlonglongline avgt 30 20,283 ± 0,230
ns/op
plus longlonglongline longlonglongline avgt 30 26,047 ± 0,230
ns/op
With patch:
Benchmark (data) (data2) Mode Cnt Score Error
Units
plus avgt 30 6,668 ± 0,055
ns/op
plus longlonglongline avgt 30 6,708 ± 0,114
ns/op
plus longlonglongline avgt 30 7,003 ± 0,064
ns/op
plus longlonglongline longlonglongline avgt 30 25,126 ± 0,392
ns/op
There could be an added cost of up to two branches for the normal case
(I believe, if one of the strings is constant, then decent JIT can
eliminate one of the branches). However, I believe, the benefit could
outweigh it, as empty strings are not that uncommon and in this case,
we reduce O(N) time and memory consumption to O(1).
What do you think? Is this a reasonable thing to do? I can file an
issue and submit a proper webrev if it looks like a useful patch.
With best regards,
Tagir Valeev.