Hi David,

On 10/05/2013 4:39 PM, David Schlosnagle wrote:
One other quick comment, if the toStringCache is non-null during
invocation of toString(), that seems to imply that the
StringBuffer/StringBuilder has not been mutated since the last
invocation of toString(), is there any reason to still use the String
copy constructor? This would address the
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6239985 portion of
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6219959.

For example, I'd envisage in AbstractStringBuilder (assuming previous
comments of pushing toStringCache from StringBuffer to
AbstractStringBuilder):

     @Override
     public String toString() {
         if (toStringCache == null) {
             toStringCache = new String(value, 0, count);
         }
         return toStringCache;
     }

Right this was our first thought too, but the specification for toString states that a new String is created. So to go this path we'd also have to push through a spec change for StringBuilder/StringBuffer. Given this is an obscure corner case I wanted to go the most minimal route possible. The copy-constructor doesn't copy the array which is what 6239985 was complaining about.

Thanks,
David

- Dave


On Fri, May 10, 2013 at 2:25 AM, David Schlosnagle <schlo...@gmail.com
<mailto:schlo...@gmail.com>> wrote:

    Hi David,

    Would it be beneficial to push the toStringCache up to
    AbstractStringBuilder so that StringBuilder.toString() benefits from
    this cache as well? It seems like this would affect both
    StringBuilder and StringBuffer for repeated calls to toString(),
    although StringBuffer would obviously have the synchronization
    overhead as well (assuming the locking is not elided away by HotSpot).

    Thanks,
    Dave


    On Fri, May 10, 2013 at 2:03 AM, David Holmes
    <david.hol...@oracle.com <mailto:david.hol...@oracle.com>> wrote:

        Short version:

        Cache the value returned by toString and use it to
        copy-construct a new String on subsequent calls to toString().
        Clear the cache on any mutating operation.

        webrev: http://cr.openjdk.java.net/~__dholmes/8013395/webrev.v2/
        <http://cr.openjdk.java.net/~dholmes/8013395/webrev.v2/>

        Testing: microbenchmark for toString performance; new regression
        test for correctness; JPRT testset core as a sanity check

        Still TBD - full SE benchmark (?)

        Thanks,
        David
        ---------

        Long version:

        One of the goals for JDK8 is to provide a path from Java ME CDC
        to Java SE (or SE Embedded). In the embedded space some pretty
        old benchmarks still get used for doing comparisons between
        JRE's. One of which makes heavy use of StringBuffer.toString,
        without modifying the StringBuffer in between.

        Up to Java 1.4.2 a StringBuffer and a String could share the
        underlying char[]. This meant that toString simply needed to
        create a new String that referenced the StringBuffer's char[]
        with no copying of the array needed. In Java 5 the
        String/StringBuffer implementations were completely revised:
        StringBuilder was introduced for non-synchronized use, and a new
        AbstractStringBuilder base class added for it and StringBuffer.
        In that implementation toString now has to copy the
        StringBuffer's char[]. This resulted in a significant
        performance regression for toString() and a bug - 6219959 - was
        opened. There is quite an elaborate evaluation in that bug
        report but bottom line was that "real code doesn't depend on
        this - won't fix".

        At some stage ME also updated to the new Java 5 code and they
        also noticed the problem. As a result CDC6 included a variation
        of the caching strategy that is proposed here.

        Going forward because we want people to be able to compare ME
        and SE with their familiar benchmarks, we would like to address
        this corner case and fix it using the caching strategy outlined.
        As a data point an 8K StringBuffer that takes ~1ms to be
        converted to a String initially, can process subsequent
        toString() calls in a few microseconds. So that performance
        issue is addressed.

        However we've added a write to a field in all the mutating
        methods which obviously adds some additional computational
        effort - though I have no doubt it is lost in the noise for all
        but the smallest of mutating methods. Even so this should be run
        against regular SE benchmarks to ensure there are no performance
        regressions there - so if anyone has a suggestion as to the best
        benchmark to run to exercise StringBuffer (if it exists), please
        let me know.

        Thanks for reading this far :)



Reply via email to