Re: RFR: 8013395 StringBuffer.toString performance regression impacting embedded benchmarks

Peter Levart Fri, 10 May 2013 00:19:39 -0700

Hi David,

One remote incompatibility note: the String returned from StringBuffer.toString() is retained by the StringBuffer until the next call to a StringBuffer mutating method. This can be observed, for example, if the returned String object is wrapped by a WeakReference. This is really a remotely incompatible difference in external behaviour, but it could be fixed by the following variation of toString(), which caches the copied char[] rather than the String itself:


    private transient char[] toStringCache;

    @Override
    public synchronized String toString() {
        if (toStringCache == null) {
            toStringCache = Arrays.copyOfRange(value, 0, count);
        }
        return new String(toStringCache, true);
    }
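
To make the WeakReference observation concrete, here is a minimal sketch. StringCachingBuffer and RetentionDemo are hypothetical names, not JDK code; the class only imitates a buffer that caches the returned String itself, which is the behaviour the note above is about. System.gc() is only a hint, so the second result is not guaranteed either way.

```java
import java.lang.ref.WeakReference;

// Hypothetical stand-in for a StringBuffer that caches the String it returned.
final class StringCachingBuffer {
    private final StringBuilder value = new StringBuilder();
    private String toStringCache; // strong reference to the last returned String

    synchronized StringCachingBuffer append(String s) {
        toStringCache = null; // any mutation drops the cache
        value.append(s);
        return this;
    }

    @Override
    public synchronized String toString() {
        if (toStringCache == null) {
            toStringCache = value.toString();
            return toStringCache; // the caller receives the cached instance
        }
        return new String(toStringCache);
    }
}

final class RetentionDemo {
    public static void main(String[] args) {
        StringCachingBuffer sb = new StringCachingBuffer().append("hello");
        WeakReference<String> ref = new WeakReference<>(sb.toString());

        System.gc();
        // The buffer's cache still holds the String strongly, so it cannot be cleared.
        System.out.println("cleared before mutation: " + (ref.get() == null));

        sb.append("!"); // drops the cache
        System.gc();    // clearing is now permitted, but not guaranteed
        System.out.println("cleared after mutation: " + (ref.get() == null));
    }
}
```

With the char[]-caching variation above, the buffer never holds the returned String, so the weak reference could be cleared even without a mutation.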


Regards, Peter

On 05/10/2013 08:03 AM, David Holmes wrote:

Short version:
Cache the value returned by toString and use it to copy-construct a new String on subsequent calls to toString(). Clear the cache on any mutating operation.
webrev: http://cr.openjdk.java.net/~dholmes/8013395/webrev.v2/
Testing: microbenchmark for toString performance; new regression test for correctness; JPRT testset core as a sanity check
Still TBD - full SE benchmark (?)

Thanks,
David
---------

Long version:
One of the goals for JDK8 is to provide a path from Java ME CDC to Java SE (or SE Embedded). In the embedded space some pretty old benchmarks still get used for doing comparisons between JREs. One of these makes heavy use of StringBuffer.toString, without modifying the StringBuffer in between.
Up to Java 1.4.2 a StringBuffer and a String could share the underlying char[]. This meant that toString simply needed to create a new String that referenced the StringBuffer's char[], with no copying of the array needed. In Java 5 the String/StringBuffer implementations were completely revised: StringBuilder was introduced for non-synchronized use, and a new AbstractStringBuilder base class was added for it and StringBuffer. In that implementation toString now has to copy the StringBuffer's char[]. This resulted in a significant performance regression for toString(), and a bug - 6219959 - was opened. There is quite an elaborate evaluation in that bug report, but the bottom line was that "real code doesn't depend on this - won't fix".
At some stage ME also updated to the new Java 5 code and they also noticed the problem. As a result CDC6 included a variation of the caching strategy that is proposed here.
Going forward, because we want people to be able to compare ME and SE with their familiar benchmarks, we would like to address this corner case and fix it using the caching strategy outlined. As a data point, an 8K StringBuffer that takes ~1ms to be converted to a String initially can process subsequent toString() calls in a few microseconds. So that performance issue is addressed.
However, we've added a write to a field in all the mutating methods, which obviously adds some additional computational effort - though I have no doubt it is lost in the noise for all but the smallest of mutating methods. Even so, this should be run against regular SE benchmarks to ensure there are no performance regressions there - so if anyone has a suggestion as to the best benchmark to run to exercise StringBuffer (if it exists), please let me know.
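
For readers who want the trade-off in code form, the following is a rough sketch only, not the webrev. CountingCacheBuffer and its conversions counter are hypothetical, added to make the behaviour visible without timing anything: repeated toString() calls reuse the cache, and every mutator pays one extra field write to clear it.

```java
// Hypothetical illustration of the caching strategy; not the actual JDK change.
final class CountingCacheBuffer {
    private final StringBuilder value = new StringBuilder();
    private String toStringCache; // null whenever the contents have changed
    private int conversions;      // how many times the full conversion ran

    synchronized CountingCacheBuffer append(String s) {
        toStringCache = null; // the extra field write added to each mutator
        value.append(s);
        return this;
    }

    synchronized int conversions() {
        return conversions;
    }

    @Override
    public synchronized String toString() {
        if (toStringCache == null) {
            toStringCache = value.toString(); // the expensive path: copies the contents
            conversions++;
            return toStringCache;
        }
        // Cheap path: copy-construct from the cached String.
        return new String(toStringCache);
    }
}

final class CountingCacheDemo {
    public static void main(String[] args) {
        CountingCacheBuffer sb = new CountingCacheBuffer().append("abc");
        for (int i = 0; i < 1000; i++) {
            sb.toString(); // only the first call converts
        }
        System.out.println("conversions after 1000 calls: " + sb.conversions());
        sb.append("def"); // invalidates the cache
        sb.toString();
        System.out.println("conversions after mutation: " + sb.conversions());
    }
}
```

A benchmark that mutates between every toString() call would see only the added field write and no benefit from the cache, which is the case the SE benchmark run is meant to check.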
Thanks for reading this far :)
