Finally some follow up on this! On Feb 13 2013, at 23:43 , Martin Buchholz wrote:
> I like the optimizations in this change, especially the avoidance of copies. > Was there some good reason the jdk never before used JavaLangAccess to avoid > String creation copies? Not that I am aware of. I did work on String and thereafter the idea occurred to me. Copying...we hates it. > I am tempted to micro-optimize this some more, e.g. specialize the hex > printing code. I might get rid of the digits table and compute ASCII > characters simply: > > ((i < 10) ? '0' : ('a' - 10)) + i I wanted to play with JMH so used this as an excuse. It turns out that using a calculation is a performance loser: Benchmark Thr Cnt Sec Mean Mean error Var Units o.o.j.s.g.t.DigitsTableVsCalc.digitsCalc 1 8 1 8463376.310 2516.169 4136965.190 ops/sec o.o.j.s.g.t.DigitsTableVsCalc.digitsTable 1 8 1 11838194.938 4936.538 15923812.479 ops/sec The source of the benchmarks: @OutputTimeUnit(TimeUnit.SECONDS) @GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit) public void digitsTable() throws InterruptedException { int end = testData.length; for (int each = 0; each < end; each++) { int i = testData[each]; output[each] = digits[i]; } } @OutputTimeUnit(TimeUnit.SECONDS) @GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit) public void digitsCalc() throws InterruptedException { int end = testData.length; for (int each = 0; each < end; each++) { int i = testData[each]; int base = i < 10 ? 48 : 87; output[each] = (char) (base + i); } } Where testData is a 100 element static int array of random digit values (0-35). > Why not formatUnsignedInt? Now added and it seems to help. > Why not make the new format methods public? They seemed too specialized to me, particularly with the constraints on radix. > > Is there a better name than createStringSharedChars? We hope those chars > will *not* be shared! How about newStringUnsafe(char[] chars) > > The spec for > + int formatUnsignedLong(long val, int shift, char[] buf, int offset, int > len); > should make it clearer whose responsibility getting the size right is. It > looks like the caller has to ensure there will be enough space, so perhaps > the caller should just pass one offset instead of offset plus len. Seems reasonable. > + * @return the lowest character location used > stray space Corrected. > > > On Wed, Feb 13, 2013 at 2:45 PM, Mike Duigou <mike.dui...@oracle.com> wrote: > I have updated the patch with some of Ulf's feedback and corrected one > cut-and-paste error that I made. > > The updated webrev is at: > > http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/ > > Mike > > On Feb 12 2013, at 18:25 , Ulf Zibis wrote: > > > Am 13.02.2013 02:34, schrieb Mike Duigou: > >> Thank you for the comments Ulf. > >> > >> On Feb 12 2013, at 17:24 , Ulf Zibis wrote: > >> > >>> Am 13.02.2013 00:30, schrieb Mike Duigou: > >>>> Hi Steven; > >>>> > >>>> I have updated the patch for Java 8. There's somewhat less code sharing > >>>> and a bit of refactoring than your last version but the performance > >>>> should be about the same or a little better. > >>>> > >>>> http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/ > >>> Couldn't you use String(buf, true) for all to(Unsigned)String(...) > >>> methods ? > >> Possibly. I didn't go looking too far with looking for additional > >> improvements. > >> > >>> Instead of calculating the mask each time, you could use: > >>> 309 private static String toUnsignedString(int i, int shift, int > >>> mask) { > >> I think that would actually be inefficient. I haven't looked at the JITed > >> code but the mask calculation is pretty cheap relative to parameter > >> passing. > > > > I believe, JIT will inline the code, so there would be no parameter passing. > > > > Additionally the calculation of char count could be faster, if you would > > have short cuts like: > > numberOfLeading2Zeros(i) > > numberOfLeading4Zeros(i) > > numberOfLeading8Zeros(i) > > ... > > So the optimum would be with separate toString methods: > > String toBase2String(int i); > > String toBase4String(int i); > > String toBase8String(int i); > > ... > > > > In any case I would save 2 lines: > > > > 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i); > > 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1); > > 313 char[] buf = new char[charCount]; > > 316 int mask = (1 << shift) - 1; > > > > -Ulf > > > >