Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

Mike Duigou Tue, 30 Apr 2013 11:59:46 -0700

Finally some follow up on this!

On Feb 13 2013, at 23:43 , Martin Buchholz wrote:


> I like the optimizations in this change, especially the avoidance of copies. 
> Was there some good reason the jdk never before used JavaLangAccess to avoid 
> String creation copies?

Not that I am aware of. I did work on String and thereafter the idea occurred 
to me. Copying...we hates it.

>  I am tempted to micro-optimize this some more, e.g. specialize the hex 
> printing code.  I might get rid of the digits table and compute ASCII 
> characters simply:
> 
> ((i < 10) ? '0' : ('a' - 10)) + i

I wanted to play with JMH so used this as an excuse.

It turns out that using a calculation is a performance loser:

Benchmark                                   Thr    Cnt  Sec         Mean   Mean 
error          Var    Units
o.o.j.s.g.t.DigitsTableVsCalc.digitsCalc      1      8    1  8463376.310     
2516.169  4136965.190  ops/sec
o.o.j.s.g.t.DigitsTableVsCalc.digitsTable     1      8    1 11838194.938     
4936.538 15923812.479  ops/sec

The source of the benchmarks:

    @OutputTimeUnit(TimeUnit.SECONDS)
    @GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit)
    public void digitsTable() throws InterruptedException {
        int end = testData.length;
        for (int each = 0; each < end; each++) {
            int i = testData[each];
            output[each] = digits[i];
        }
    }

    @OutputTimeUnit(TimeUnit.SECONDS)
    @GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit)
    public void digitsCalc() throws InterruptedException {
        int end = testData.length;
        for (int each = 0; each < end; each++) {
            int i = testData[each];
            int base = i < 10 ? 48 : 87;
            output[each] = (char) (base + i);
        }
    }

Where testData is a 100 element static int array of random digit values (0-35).

> Why not formatUnsignedInt?

Now added and it seems to help.

> Why not make the new format methods public?

They seemed too specialized to me, particularly with the constraints on radix.

> 
> Is there a better name than createStringSharedChars?  We hope those chars 
> will *not* be shared!  How about newStringUnsafe(char[] chars)
> 
> The spec for
> +    int formatUnsignedLong(long val, int shift, char[] buf, int offset, int 
> len);
> should make it clearer whose responsibility getting the size right is.  It 
> looks like the caller has to ensure there will be enough space, so perhaps 
> the caller should just pass one offset instead of offset plus len.

Seems reasonable.

> +     * @return the lowest character  location used
> stray space

Corrected.

> 
> 
> On Wed, Feb 13, 2013 at 2:45 PM, Mike Duigou <mike.dui...@oracle.com> wrote:
> I have updated the patch with some of Ulf's feedback and corrected one 
> cut-and-paste error that I made.
> 
> The updated webrev is at:
> 
> http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/
> 
> Mike
> 
> On Feb 12 2013, at 18:25 , Ulf Zibis wrote:
> 
> > Am 13.02.2013 02:34, schrieb Mike Duigou:
> >> Thank you for the comments Ulf.
> >>
> >> On Feb 12 2013, at 17:24 , Ulf Zibis wrote:
> >>
> >>> Am 13.02.2013 00:30, schrieb Mike Duigou:
> >>>> Hi Steven;
> >>>>
> >>>> I have updated the patch for Java 8. There's somewhat less code sharing 
> >>>> and a bit of refactoring than your last version but the performance 
> >>>> should be about the same or a little better.
> >>>>
> >>>> http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/
> >>> Couldn't you use String(buf, true) for all to(Unsigned)String(...) 
> >>> methods ?
> >> Possibly. I didn't go looking too far with looking for additional 
> >> improvements.
> >>
> >>> Instead of calculating the mask each time, you could use:
> >>> 309     private static String toUnsignedString(int i, int shift, int 
> >>> mask) {
> >> I think that would actually be inefficient. I haven't looked at the JITed 
> >> code but the mask calculation is pretty cheap relative to parameter 
> >> passing.
> >
> > I believe, JIT will inline the code, so there would be no parameter passing.
> >
> > Additionally the calculation of char count could be faster, if you would 
> > have short cuts like:
> > numberOfLeading2Zeros(i)
> > numberOfLeading4Zeros(i)
> > numberOfLeading8Zeros(i)
> > ...
> > So the optimum would be with separate toString methods:
> > String toBase2String(int i);
> > String toBase4String(int i);
> > String toBase8String(int i);
> > ...
> >
> > In any case I would save 2 lines:
> >
> > 311         int mag = Integer.SIZE - Long.numberOfLeadingZeros(i);
> > 312         int charCount = Math.max(((mag + (shift - 1)) / shift), 1);
> > 313         char[] buf = new char[charCount];
> > 316         int mask = (1 << shift) - 1;
> >
> > -Ulf
> >
> 
>

Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

Reply via email to