Hi David,
On 10/05/2013 4:39 PM, David Schlosnagle wrote:
One other quick comment, if the toStringCache is non-null during
invocation of toString(), that seems to imply that the
StringBuffer/StringBuilder has not been mutated since the last
invocation of toString(), is there any reason to still use the String
copy constructor? This would address the
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6239985 portion of
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6219959.
For example, I'd envisage in AbstractStringBuilder (assuming previous
comments of pushing toStringCache from StringBuffer to
AbstractStringBuilder):
@Override
public String toString() {
if (toStringCache == null) {
toStringCache = new String(value, 0, count);
}
return toStringCache;
}
Right this was our first thought too, but the specification for toString
states that a new String is created. So to go this path we'd also have
to push through a spec change for StringBuilder/StringBuffer. Given this
is an obscure corner case I wanted to go the most minimal route
possible. The copy-constructor doesn't copy the array which is what
6239985 was complaining about.
Thanks,
David
- Dave
On Fri, May 10, 2013 at 2:25 AM, David Schlosnagle <schlo...@gmail.com
<mailto:schlo...@gmail.com>> wrote:
Hi David,
Would it be beneficial to push the toStringCache up to
AbstractStringBuilder so that StringBuilder.toString() benefits from
this cache as well? It seems like this would affect both
StringBuilder and StringBuffer for repeated calls to toString(),
although StringBuffer would obviously have the synchronization
overhead as well (assuming the locking is not elided away by HotSpot).
Thanks,
Dave
On Fri, May 10, 2013 at 2:03 AM, David Holmes
<david.hol...@oracle.com <mailto:david.hol...@oracle.com>> wrote:
Short version:
Cache the value returned by toString and use it to
copy-construct a new String on subsequent calls to toString().
Clear the cache on any mutating operation.
webrev: http://cr.openjdk.java.net/~__dholmes/8013395/webrev.v2/
<http://cr.openjdk.java.net/~dholmes/8013395/webrev.v2/>
Testing: microbenchmark for toString performance; new regression
test for correctness; JPRT testset core as a sanity check
Still TBD - full SE benchmark (?)
Thanks,
David
---------
Long version:
One of the goals for JDK8 is to provide a path from Java ME CDC
to Java SE (or SE Embedded). In the embedded space some pretty
old benchmarks still get used for doing comparisons between
JRE's. One of which makes heavy use of StringBuffer.toString,
without modifying the StringBuffer in between.
Up to Java 1.4.2 a StringBuffer and a String could share the
underlying char[]. This meant that toString simply needed to
create a new String that referenced the StringBuffer's char[]
with no copying of the array needed. In Java 5 the
String/StringBuffer implementations were completely revised:
StringBuilder was introduced for non-synchronized use, and a new
AbstractStringBuilder base class added for it and StringBuffer.
In that implementation toString now has to copy the
StringBuffer's char[]. This resulted in a significant
performance regression for toString() and a bug - 6219959 - was
opened. There is quite an elaborate evaluation in that bug
report but bottom line was that "real code doesn't depend on
this - won't fix".
At some stage ME also updated to the new Java 5 code and they
also noticed the problem. As a result CDC6 included a variation
of the caching strategy that is proposed here.
Going forward because we want people to be able to compare ME
and SE with their familiar benchmarks, we would like to address
this corner case and fix it using the caching strategy outlined.
As a data point an 8K StringBuffer that takes ~1ms to be
converted to a String initially, can process subsequent
toString() calls in a few microseconds. So that performance
issue is addressed.
However we've added a write to a field in all the mutating
methods which obviously adds some additional computational
effort - though I have no doubt it is lost in the noise for all
but the smallest of mutating methods. Even so this should be run
against regular SE benchmarks to ensure there are no performance
regressions there - so if anyone has a suggestion as to the best
benchmark to run to exercise StringBuffer (if it exists), please
let me know.
Thanks for reading this far :)