Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

2013-04-30 Thread Mike Duigou
Finally some follow up on this!

On Feb 13 2013, at 23:43 , Martin Buchholz wrote:

 I like the optimizations in this change, especially the avoidance of copies. 
 Was there some good reason the jdk never before used JavaLangAccess to avoid 
 String creation copies?

Not that I am aware of. I did work on String and thereafter the idea occurred 
to me. Copying...we hates it.

  I am tempted to micro-optimize this some more, e.g. specialize the hex 
 printing code.  I might get rid of the digits table and compute ASCII 
 characters simply:
 
 ((i  10) ? '0' : ('a' - 10)) + i

I wanted to play with JMH so used this as an excuse.

It turns out that using a calculation is a performance loser:

Benchmark   ThrCnt  Sec Mean   Mean 
error  VarUnits
o.o.j.s.g.t.DigitsTableVsCalc.digitsCalc  1  81  8463376.310 
2516.169  4136965.190  ops/sec
o.o.j.s.g.t.DigitsTableVsCalc.digitsTable 1  81 11838194.938 
4936.538 15923812.479  ops/sec

The source of the benchmarks:

@OutputTimeUnit(TimeUnit.SECONDS)
@GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit)
public void digitsTable() throws InterruptedException {
int end = testData.length;
for (int each = 0; each  end; each++) {
int i = testData[each];
output[each] = digits[i];
}
}

@OutputTimeUnit(TimeUnit.SECONDS)
@GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit)
public void digitsCalc() throws InterruptedException {
int end = testData.length;
for (int each = 0; each  end; each++) {
int i = testData[each];
int base = i  10 ? 48 : 87;
output[each] = (char) (base + i);
}
}

Where testData is a 100 element static int array of random digit values (0-35).

 Why not formatUnsignedInt?

Now added and it seems to help.

 Why not make the new format methods public?

They seemed too specialized to me, particularly with the constraints on radix.

 
 Is there a better name than createStringSharedChars?  We hope those chars 
 will *not* be shared!  How about newStringUnsafe(char[] chars)
 
 The spec for
 +int formatUnsignedLong(long val, int shift, char[] buf, int offset, int 
 len);
 should make it clearer whose responsibility getting the size right is.  It 
 looks like the caller has to ensure there will be enough space, so perhaps 
 the caller should just pass one offset instead of offset plus len.

Seems reasonable.

 + * @return the lowest character  location used
 stray space

Corrected.

 
 
 On Wed, Feb 13, 2013 at 2:45 PM, Mike Duigou mike.dui...@oracle.com wrote:
 I have updated the patch with some of Ulf's feedback and corrected one 
 cut-and-paste error that I made.
 
 The updated webrev is at:
 
 http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/
 
 Mike
 
 On Feb 12 2013, at 18:25 , Ulf Zibis wrote:
 
  Am 13.02.2013 02:34, schrieb Mike Duigou:
  Thank you for the comments Ulf.
 
  On Feb 12 2013, at 17:24 , Ulf Zibis wrote:
 
  Am 13.02.2013 00:30, schrieb Mike Duigou:
  Hi Steven;
 
  I have updated the patch for Java 8. There's somewhat less code sharing 
  and a bit of refactoring than your last version but the performance 
  should be about the same or a little better.
 
  http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/
  Couldn't you use String(buf, true) for all to(Unsigned)String(...) 
  methods ?
  Possibly. I didn't go looking too far with looking for additional 
  improvements.
 
  Instead of calculating the mask each time, you could use:
  309 private static String toUnsignedString(int i, int shift, int 
  mask) {
  I think that would actually be inefficient. I haven't looked at the JITed 
  code but the mask calculation is pretty cheap relative to parameter 
  passing.
 
  I believe, JIT will inline the code, so there would be no parameter passing.
 
  Additionally the calculation of char count could be faster, if you would 
  have short cuts like:
  numberOfLeading2Zeros(i)
  numberOfLeading4Zeros(i)
  numberOfLeading8Zeros(i)
  ...
  So the optimum would be with separate toString methods:
  String toBase2String(int i);
  String toBase4String(int i);
  String toBase8String(int i);
  ...
 
  In any case I would save 2 lines:
 
  311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i);
  312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1);
  313 char[] buf = new char[charCount];
  316 int mask = (1  shift) - 1;
 
  -Ulf
 
 
 



Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

2013-03-20 Thread Steven Schlansker

On Feb 13, 2013, at 2:45 PM, Mike Duigou mike.dui...@oracle.com wrote:

 I have updated the patch with some of Ulf's feedback and corrected one 
 cut-and-paste error that I made.
 
 The updated webrev is at:
 
 http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/
 

One last ping to make sure this doesn't slip through the cracks, as I haven't 
seen the commit go through the public JDK8 repo…  or maybe I missed it.

Thanks again for looking at this :-)

 Mike
 
 On Feb 12 2013, at 18:25 , Ulf Zibis wrote:
 
 Am 13.02.2013 02:34, schrieb Mike Duigou:
 Thank you for the comments Ulf.
 
 On Feb 12 2013, at 17:24 , Ulf Zibis wrote:
 
 Am 13.02.2013 00:30, schrieb Mike Duigou:
 Hi Steven;
 
 I have updated the patch for Java 8. There's somewhat less code sharing 
 and a bit of refactoring than your last version but the performance 
 should be about the same or a little better.
 
 http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/
 Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods 
 ?
 Possibly. I didn't go looking too far with looking for additional 
 improvements.
 
 Instead of calculating the mask each time, you could use:
 309 private static String toUnsignedString(int i, int shift, int mask) 
 {
 I think that would actually be inefficient. I haven't looked at the JITed 
 code but the mask calculation is pretty cheap relative to parameter passing.
 
 I believe, JIT will inline the code, so there would be no parameter passing.
 
 Additionally the calculation of char count could be faster, if you would 
 have short cuts like:
 numberOfLeading2Zeros(i)
 numberOfLeading4Zeros(i)
 numberOfLeading8Zeros(i)
 ...
 So the optimum would be with separate toString methods:
 String toBase2String(int i);
 String toBase4String(int i);
 String toBase8String(int i);
 ...
 
 In any case I would save 2 lines:
 
 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i);
 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1);
 313 char[] buf = new char[charCount];
 316 int mask = (1  shift) - 1;
 
 -Ulf
 
 



Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

2013-03-20 Thread Mike Duigou
Hi Steven;

I haven't forgotten and it's still in my queue for Java 8. I haven't had time 
to respond to Martin Buchholz's feedback.

http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-February/014528.html

I suspect he is right about computing digits vs the table being a better choice 
but I would want to test that first with a proper microbenchmark (and then fix 
a bunch of other places that use the same table).

If you have time to follow up on any of this things will go quicker. :-)

Mike

On Mar 20 2013, at 15:39 , Steven Schlansker wrote:

 
 On Feb 13, 2013, at 2:45 PM, Mike Duigou mike.dui...@oracle.com wrote:
 
 I have updated the patch with some of Ulf's feedback and corrected one 
 cut-and-paste error that I made.
 
 The updated webrev is at:
 
 http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/
 
 
 One last ping to make sure this doesn't slip through the cracks, as I haven't 
 seen the commit go through the public JDK8 repo…  or maybe I missed it.
 
 Thanks again for looking at this :-)
 
 Mike
 
 On Feb 12 2013, at 18:25 , Ulf Zibis wrote:
 
 Am 13.02.2013 02:34, schrieb Mike Duigou:
 Thank you for the comments Ulf.
 
 On Feb 12 2013, at 17:24 , Ulf Zibis wrote:
 
 Am 13.02.2013 00:30, schrieb Mike Duigou:
 Hi Steven;
 
 I have updated the patch for Java 8. There's somewhat less code sharing 
 and a bit of refactoring than your last version but the performance 
 should be about the same or a little better.
 
 http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/
 Couldn't you use String(buf, true) for all to(Unsigned)String(...) 
 methods ?
 Possibly. I didn't go looking too far with looking for additional 
 improvements.
 
 Instead of calculating the mask each time, you could use:
 309 private static String toUnsignedString(int i, int shift, int 
 mask) {
 I think that would actually be inefficient. I haven't looked at the JITed 
 code but the mask calculation is pretty cheap relative to parameter 
 passing.
 
 I believe, JIT will inline the code, so there would be no parameter passing.
 
 Additionally the calculation of char count could be faster, if you would 
 have short cuts like:
 numberOfLeading2Zeros(i)
 numberOfLeading4Zeros(i)
 numberOfLeading8Zeros(i)
 ...
 So the optimum would be with separate toString methods:
 String toBase2String(int i);
 String toBase4String(int i);
 String toBase8String(int i);
 ...
 
 In any case I would save 2 lines:
 
 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i);
 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1);
 313 char[] buf = new char[charCount];
 316 int mask = (1  shift) - 1;
 
 -Ulf
 
 
 



Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

2013-02-13 Thread Mike Duigou
I have updated the patch with some of Ulf's feedback and corrected one 
cut-and-paste error that I made.

The updated webrev is at:

http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/

Mike

On Feb 12 2013, at 18:25 , Ulf Zibis wrote:

 Am 13.02.2013 02:34, schrieb Mike Duigou:
 Thank you for the comments Ulf.
 
 On Feb 12 2013, at 17:24 , Ulf Zibis wrote:
 
 Am 13.02.2013 00:30, schrieb Mike Duigou:
 Hi Steven;
 
 I have updated the patch for Java 8. There's somewhat less code sharing 
 and a bit of refactoring than your last version but the performance should 
 be about the same or a little better.
 
 http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/
 Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ?
 Possibly. I didn't go looking too far with looking for additional 
 improvements.
 
 Instead of calculating the mask each time, you could use:
 309 private static String toUnsignedString(int i, int shift, int mask) {
 I think that would actually be inefficient. I haven't looked at the JITed 
 code but the mask calculation is pretty cheap relative to parameter passing.
 
 I believe, JIT will inline the code, so there would be no parameter passing.
 
 Additionally the calculation of char count could be faster, if you would have 
 short cuts like:
 numberOfLeading2Zeros(i)
 numberOfLeading4Zeros(i)
 numberOfLeading8Zeros(i)
 ...
 So the optimum would be with separate toString methods:
 String toBase2String(int i);
 String toBase4String(int i);
 String toBase8String(int i);
 ...
 
 In any case I would save 2 lines:
 
 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i);
 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1);
 313 char[] buf = new char[charCount];
 316 int mask = (1  shift) - 1;
 
 -Ulf
 



Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

2013-02-13 Thread Ulf Zibis

Am 13.02.2013 23:45, schrieb Mike Duigou:

I have updated the patch with some of Ulf's feedback and corrected one 
cut-and-paste error that I made.

The updated webrev is at:

http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/


Fine, but I still see the typo in j.l.System.

-Ulf



Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

2013-02-13 Thread Martin Buchholz
I like the optimizations in this change, especially the avoidance of
copies. Was there some good reason the jdk never before used JavaLangAccess
to avoid String creation copies?

 I am tempted to micro-optimize this some more, e.g. specialize the hex
printing code.  I might get rid of the digits table and compute ASCII
characters simply:

((i  10) ? '0' : ('a' - 10)) + i

Why not formatUnsignedInt?
Why not make the new format methods public?

Is there a better name than createStringSharedChars?  We hope those chars
will *not* be shared!  How about newStringUnsafe(char[] chars)

The spec for
+int formatUnsignedLong(long val, int shift, char[] buf, int offset,
int len);
should make it clearer whose responsibility getting the size right is.  It
looks like the caller has to ensure there will be enough space, so perhaps
the caller should just pass one offset instead of offset plus len.

+ * @return the lowest character  location used
stray space



On Wed, Feb 13, 2013 at 2:45 PM, Mike Duigou mike.dui...@oracle.com wrote:

 I have updated the patch with some of Ulf's feedback and corrected one
 cut-and-paste error that I made.

 The updated webrev is at:

 http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/

 Mike

 On Feb 12 2013, at 18:25 , Ulf Zibis wrote:

  Am 13.02.2013 02:34, schrieb Mike Duigou:
  Thank you for the comments Ulf.
 
  On Feb 12 2013, at 17:24 , Ulf Zibis wrote:
 
  Am 13.02.2013 00:30, schrieb Mike Duigou:
  Hi Steven;
 
  I have updated the patch for Java 8. There's somewhat less code
 sharing and a bit of refactoring than your last version but the performance
 should be about the same or a little better.
 
  http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/
  Couldn't you use String(buf, true) for all to(Unsigned)String(...)
 methods ?
  Possibly. I didn't go looking too far with looking for additional
 improvements.
 
  Instead of calculating the mask each time, you could use:
  309 private static String toUnsignedString(int i, int shift, int
 mask) {
  I think that would actually be inefficient. I haven't looked at the
 JITed code but the mask calculation is pretty cheap relative to parameter
 passing.
 
  I believe, JIT will inline the code, so there would be no parameter
 passing.
 
  Additionally the calculation of char count could be faster, if you would
 have short cuts like:
  numberOfLeading2Zeros(i)
  numberOfLeading4Zeros(i)
  numberOfLeading8Zeros(i)
  ...
  So the optimum would be with separate toString methods:
  String toBase2String(int i);
  String toBase4String(int i);
  String toBase8String(int i);
  ...
 
  In any case I would save 2 lines:
 
  311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i);
  312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1);
  313 char[] buf = new char[charCount];
  316 int mask = (1  shift) - 1;
 
  -Ulf
 




Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

2013-02-12 Thread Mike Duigou
Hi Steven;

I have updated the patch for Java 8. There's somewhat less code sharing and a 
bit of refactoring than your last version but the performance should be about 
the same or a little better.

http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/

I am currently running the jtreg regression suite across the changes.

Mike



On Feb 11 2013, at 21:18 , Steven Schlansker wrote:

 On Feb 11, 2013, at 9:04 PM, Mike Duigou mike.dui...@oracle.com wrote:
 
 I have been following up on this issue. I am going to have to adapt the code 
 a bit because there have been some changes in JDK8.
 
 I won't forget this issue though. It is possible I may not have time to 
 backport it to Java 7.
 
 
 Please let me know if I can be of any help in a way that doesn't actually 
 cost more work than I end up contributing :-)
 
 Mile
 
 On Feb 3 2013, at 21:26 , Steven Schlansker wrote:
 
 
 On Feb 1, 2013, at 11:42 AM, Mike Duigou mike.dui...@oracle.com wrote:
 
 I have created another issue 8007398 for the changes to Long. We can even 
 test and push the two issues at the same time. Separating them into two 
 changesets makes the intent easier to follow for future maintainers.
 
 We can use the same webrev. There's no need to create another.
 
 - I would like to see if performed of toString() can be improved further 
 by using String(char[] value, boolean share) constructor via a 
 sun.miscSharedSecret.JavaLangAccesss method to construct the string 
 directly from the character array. You could test to see if this has 
 positive benefit by temporarily using a static char array.
 
 I will incorporate this into my next revision
 
 
 - public static String toString(long msb, long lsb) should be private. 
 There's no compelling reason to add this to the API.
 
 - Have you run this code against any of the existing regression tests?
 
 Yes, I ran the jtreg UUID and Long tests, all pass.  I ran the Apache 
 Harmony UUID test cases against the pre-integrated version of the code.  
 (There should only have been minor modifications since then, variable 
 renamings, whatnot…)
 
 OK, once we have a final webrev then I will run final tests and push this!
 
 Hi,
 
 Here are the updates to the webrev.  I hope the changes are in line with 
 what you'd had in mind:
 http://dl.dropbox.com/u/1422321/uuid_webrev/index.html
 http://dl.dropbox.com/u/1422321/uuid_webrev.zip
 
 Please let me know if there are any further modifications I should make.
 
 Thanks!
 Steven
 
 
 



Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

2013-02-12 Thread Mike Duigou
Thank you for the comments Ulf.

On Feb 12 2013, at 17:24 , Ulf Zibis wrote:

 Am 13.02.2013 00:30, schrieb Mike Duigou:
 Hi Steven;
 
 I have updated the patch for Java 8. There's somewhat less code sharing and 
 a bit of refactoring than your last version but the performance should be 
 about the same or a little better.
 
 http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/
 
 Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ?

Possibly. I didn't go looking too far with looking for additional improvements.

 Instead of calculating the mask each time, you could use:
 309 private static String toUnsignedString(int i, int shift, int mask) {

I think that would actually be inefficient. I haven't looked at the JITed code 
but the mask calculation is pretty cheap relative to parameter passing.

 Additionally in e.g. Integer.toString(int i, int radix) you could provide the 
 fast version with:

I think the only change I will make is to Integer.toUnsignedString() which 
becomes :
return Long.toUnsignedString(toUnsignedLong(i), radix);

Mike



Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

2013-02-12 Thread Ulf Zibis


Am 13.02.2013 00:30, schrieb Mike Duigou:

Hi Steven;

I have updated the patch for Java 8. There's somewhat less code sharing and a 
bit of refactoring than your last version but the performance should be about 
the same or a little better.

http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/


In java.lang.System:
There is a space missing before '{'
1189 sun.misc.SharedSecrets.setJavaLangAccess(new 
sun.misc.JavaLangAccess(){

Additionally I more would like clazz instead klass for following parameters.

-Ulf




Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString

2013-02-12 Thread Ulf Zibis

Am 13.02.2013 02:34, schrieb Mike Duigou:

Thank you for the comments Ulf.

On Feb 12 2013, at 17:24 , Ulf Zibis wrote:


Am 13.02.2013 00:30, schrieb Mike Duigou:

Hi Steven;

I have updated the patch for Java 8. There's somewhat less code sharing and a 
bit of refactoring than your last version but the performance should be about 
the same or a little better.

http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/

Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ?

Possibly. I didn't go looking too far with looking for additional improvements.


Instead of calculating the mask each time, you could use:
309 private static String toUnsignedString(int i, int shift, int mask) {

I think that would actually be inefficient. I haven't looked at the JITed code 
but the mask calculation is pretty cheap relative to parameter passing.


I believe, JIT will inline the code, so there would be no parameter passing.

Additionally the calculation of char count could be faster, if you would have 
short cuts like:
numberOfLeading2Zeros(i)
numberOfLeading4Zeros(i)
numberOfLeading8Zeros(i)
...
So the optimum would be with separate toString methods:
String toBase2String(int i);
String toBase4String(int i);
String toBase8String(int i);
...

In any case I would save 2 lines:

 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i);
 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1);
 313 char[] buf = new char[charCount];
 316 int mask = (1  shift) - 1;

-Ulf