Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
Finally some follow up on this! On Feb 13 2013, at 23:43 , Martin Buchholz wrote: I like the optimizations in this change, especially the avoidance of copies. Was there some good reason the jdk never before used JavaLangAccess to avoid String creation copies? Not that I am aware of. I did work on String and thereafter the idea occurred to me. Copying...we hates it. I am tempted to micro-optimize this some more, e.g. specialize the hex printing code. I might get rid of the digits table and compute ASCII characters simply: ((i 10) ? '0' : ('a' - 10)) + i I wanted to play with JMH so used this as an excuse. It turns out that using a calculation is a performance loser: Benchmark ThrCnt Sec Mean Mean error VarUnits o.o.j.s.g.t.DigitsTableVsCalc.digitsCalc 1 81 8463376.310 2516.169 4136965.190 ops/sec o.o.j.s.g.t.DigitsTableVsCalc.digitsTable 1 81 11838194.938 4936.538 15923812.479 ops/sec The source of the benchmarks: @OutputTimeUnit(TimeUnit.SECONDS) @GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit) public void digitsTable() throws InterruptedException { int end = testData.length; for (int each = 0; each end; each++) { int i = testData[each]; output[each] = digits[i]; } } @OutputTimeUnit(TimeUnit.SECONDS) @GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit) public void digitsCalc() throws InterruptedException { int end = testData.length; for (int each = 0; each end; each++) { int i = testData[each]; int base = i 10 ? 48 : 87; output[each] = (char) (base + i); } } Where testData is a 100 element static int array of random digit values (0-35). Why not formatUnsignedInt? Now added and it seems to help. Why not make the new format methods public? They seemed too specialized to me, particularly with the constraints on radix. Is there a better name than createStringSharedChars? We hope those chars will *not* be shared! How about newStringUnsafe(char[] chars) The spec for +int formatUnsignedLong(long val, int shift, char[] buf, int offset, int len); should make it clearer whose responsibility getting the size right is. It looks like the caller has to ensure there will be enough space, so perhaps the caller should just pass one offset instead of offset plus len. Seems reasonable. + * @return the lowest character location used stray space Corrected. On Wed, Feb 13, 2013 at 2:45 PM, Mike Duigou mike.dui...@oracle.com wrote: I have updated the patch with some of Ulf's feedback and corrected one cut-and-paste error that I made. The updated webrev is at: http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/ Mike On Feb 12 2013, at 18:25 , Ulf Zibis wrote: Am 13.02.2013 02:34, schrieb Mike Duigou: Thank you for the comments Ulf. On Feb 12 2013, at 17:24 , Ulf Zibis wrote: Am 13.02.2013 00:30, schrieb Mike Duigou: Hi Steven; I have updated the patch for Java 8. There's somewhat less code sharing and a bit of refactoring than your last version but the performance should be about the same or a little better. http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/ Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ? Possibly. I didn't go looking too far with looking for additional improvements. Instead of calculating the mask each time, you could use: 309 private static String toUnsignedString(int i, int shift, int mask) { I think that would actually be inefficient. I haven't looked at the JITed code but the mask calculation is pretty cheap relative to parameter passing. I believe, JIT will inline the code, so there would be no parameter passing. Additionally the calculation of char count could be faster, if you would have short cuts like: numberOfLeading2Zeros(i) numberOfLeading4Zeros(i) numberOfLeading8Zeros(i) ... So the optimum would be with separate toString methods: String toBase2String(int i); String toBase4String(int i); String toBase8String(int i); ... In any case I would save 2 lines: 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i); 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1); 313 char[] buf = new char[charCount]; 316 int mask = (1 shift) - 1; -Ulf
Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
On Feb 13, 2013, at 2:45 PM, Mike Duigou mike.dui...@oracle.com wrote: I have updated the patch with some of Ulf's feedback and corrected one cut-and-paste error that I made. The updated webrev is at: http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/ One last ping to make sure this doesn't slip through the cracks, as I haven't seen the commit go through the public JDK8 repo… or maybe I missed it. Thanks again for looking at this :-) Mike On Feb 12 2013, at 18:25 , Ulf Zibis wrote: Am 13.02.2013 02:34, schrieb Mike Duigou: Thank you for the comments Ulf. On Feb 12 2013, at 17:24 , Ulf Zibis wrote: Am 13.02.2013 00:30, schrieb Mike Duigou: Hi Steven; I have updated the patch for Java 8. There's somewhat less code sharing and a bit of refactoring than your last version but the performance should be about the same or a little better. http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/ Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ? Possibly. I didn't go looking too far with looking for additional improvements. Instead of calculating the mask each time, you could use: 309 private static String toUnsignedString(int i, int shift, int mask) { I think that would actually be inefficient. I haven't looked at the JITed code but the mask calculation is pretty cheap relative to parameter passing. I believe, JIT will inline the code, so there would be no parameter passing. Additionally the calculation of char count could be faster, if you would have short cuts like: numberOfLeading2Zeros(i) numberOfLeading4Zeros(i) numberOfLeading8Zeros(i) ... So the optimum would be with separate toString methods: String toBase2String(int i); String toBase4String(int i); String toBase8String(int i); ... In any case I would save 2 lines: 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i); 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1); 313 char[] buf = new char[charCount]; 316 int mask = (1 shift) - 1; -Ulf
Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
Hi Steven; I haven't forgotten and it's still in my queue for Java 8. I haven't had time to respond to Martin Buchholz's feedback. http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-February/014528.html I suspect he is right about computing digits vs the table being a better choice but I would want to test that first with a proper microbenchmark (and then fix a bunch of other places that use the same table). If you have time to follow up on any of this things will go quicker. :-) Mike On Mar 20 2013, at 15:39 , Steven Schlansker wrote: On Feb 13, 2013, at 2:45 PM, Mike Duigou mike.dui...@oracle.com wrote: I have updated the patch with some of Ulf's feedback and corrected one cut-and-paste error that I made. The updated webrev is at: http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/ One last ping to make sure this doesn't slip through the cracks, as I haven't seen the commit go through the public JDK8 repo… or maybe I missed it. Thanks again for looking at this :-) Mike On Feb 12 2013, at 18:25 , Ulf Zibis wrote: Am 13.02.2013 02:34, schrieb Mike Duigou: Thank you for the comments Ulf. On Feb 12 2013, at 17:24 , Ulf Zibis wrote: Am 13.02.2013 00:30, schrieb Mike Duigou: Hi Steven; I have updated the patch for Java 8. There's somewhat less code sharing and a bit of refactoring than your last version but the performance should be about the same or a little better. http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/ Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ? Possibly. I didn't go looking too far with looking for additional improvements. Instead of calculating the mask each time, you could use: 309 private static String toUnsignedString(int i, int shift, int mask) { I think that would actually be inefficient. I haven't looked at the JITed code but the mask calculation is pretty cheap relative to parameter passing. I believe, JIT will inline the code, so there would be no parameter passing. Additionally the calculation of char count could be faster, if you would have short cuts like: numberOfLeading2Zeros(i) numberOfLeading4Zeros(i) numberOfLeading8Zeros(i) ... So the optimum would be with separate toString methods: String toBase2String(int i); String toBase4String(int i); String toBase8String(int i); ... In any case I would save 2 lines: 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i); 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1); 313 char[] buf = new char[charCount]; 316 int mask = (1 shift) - 1; -Ulf
Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
I have updated the patch with some of Ulf's feedback and corrected one cut-and-paste error that I made. The updated webrev is at: http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/ Mike On Feb 12 2013, at 18:25 , Ulf Zibis wrote: Am 13.02.2013 02:34, schrieb Mike Duigou: Thank you for the comments Ulf. On Feb 12 2013, at 17:24 , Ulf Zibis wrote: Am 13.02.2013 00:30, schrieb Mike Duigou: Hi Steven; I have updated the patch for Java 8. There's somewhat less code sharing and a bit of refactoring than your last version but the performance should be about the same or a little better. http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/ Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ? Possibly. I didn't go looking too far with looking for additional improvements. Instead of calculating the mask each time, you could use: 309 private static String toUnsignedString(int i, int shift, int mask) { I think that would actually be inefficient. I haven't looked at the JITed code but the mask calculation is pretty cheap relative to parameter passing. I believe, JIT will inline the code, so there would be no parameter passing. Additionally the calculation of char count could be faster, if you would have short cuts like: numberOfLeading2Zeros(i) numberOfLeading4Zeros(i) numberOfLeading8Zeros(i) ... So the optimum would be with separate toString methods: String toBase2String(int i); String toBase4String(int i); String toBase8String(int i); ... In any case I would save 2 lines: 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i); 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1); 313 char[] buf = new char[charCount]; 316 int mask = (1 shift) - 1; -Ulf
Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
Am 13.02.2013 23:45, schrieb Mike Duigou: I have updated the patch with some of Ulf's feedback and corrected one cut-and-paste error that I made. The updated webrev is at: http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/ Fine, but I still see the typo in j.l.System. -Ulf
Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
I like the optimizations in this change, especially the avoidance of copies. Was there some good reason the jdk never before used JavaLangAccess to avoid String creation copies? I am tempted to micro-optimize this some more, e.g. specialize the hex printing code. I might get rid of the digits table and compute ASCII characters simply: ((i 10) ? '0' : ('a' - 10)) + i Why not formatUnsignedInt? Why not make the new format methods public? Is there a better name than createStringSharedChars? We hope those chars will *not* be shared! How about newStringUnsafe(char[] chars) The spec for +int formatUnsignedLong(long val, int shift, char[] buf, int offset, int len); should make it clearer whose responsibility getting the size right is. It looks like the caller has to ensure there will be enough space, so perhaps the caller should just pass one offset instead of offset plus len. + * @return the lowest character location used stray space On Wed, Feb 13, 2013 at 2:45 PM, Mike Duigou mike.dui...@oracle.com wrote: I have updated the patch with some of Ulf's feedback and corrected one cut-and-paste error that I made. The updated webrev is at: http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/ Mike On Feb 12 2013, at 18:25 , Ulf Zibis wrote: Am 13.02.2013 02:34, schrieb Mike Duigou: Thank you for the comments Ulf. On Feb 12 2013, at 17:24 , Ulf Zibis wrote: Am 13.02.2013 00:30, schrieb Mike Duigou: Hi Steven; I have updated the patch for Java 8. There's somewhat less code sharing and a bit of refactoring than your last version but the performance should be about the same or a little better. http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/ Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ? Possibly. I didn't go looking too far with looking for additional improvements. Instead of calculating the mask each time, you could use: 309 private static String toUnsignedString(int i, int shift, int mask) { I think that would actually be inefficient. I haven't looked at the JITed code but the mask calculation is pretty cheap relative to parameter passing. I believe, JIT will inline the code, so there would be no parameter passing. Additionally the calculation of char count could be faster, if you would have short cuts like: numberOfLeading2Zeros(i) numberOfLeading4Zeros(i) numberOfLeading8Zeros(i) ... So the optimum would be with separate toString methods: String toBase2String(int i); String toBase4String(int i); String toBase8String(int i); ... In any case I would save 2 lines: 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i); 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1); 313 char[] buf = new char[charCount]; 316 int mask = (1 shift) - 1; -Ulf
Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
Hi Steven; I have updated the patch for Java 8. There's somewhat less code sharing and a bit of refactoring than your last version but the performance should be about the same or a little better. http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/ I am currently running the jtreg regression suite across the changes. Mike On Feb 11 2013, at 21:18 , Steven Schlansker wrote: On Feb 11, 2013, at 9:04 PM, Mike Duigou mike.dui...@oracle.com wrote: I have been following up on this issue. I am going to have to adapt the code a bit because there have been some changes in JDK8. I won't forget this issue though. It is possible I may not have time to backport it to Java 7. Please let me know if I can be of any help in a way that doesn't actually cost more work than I end up contributing :-) Mile On Feb 3 2013, at 21:26 , Steven Schlansker wrote: On Feb 1, 2013, at 11:42 AM, Mike Duigou mike.dui...@oracle.com wrote: I have created another issue 8007398 for the changes to Long. We can even test and push the two issues at the same time. Separating them into two changesets makes the intent easier to follow for future maintainers. We can use the same webrev. There's no need to create another. - I would like to see if performed of toString() can be improved further by using String(char[] value, boolean share) constructor via a sun.miscSharedSecret.JavaLangAccesss method to construct the string directly from the character array. You could test to see if this has positive benefit by temporarily using a static char array. I will incorporate this into my next revision - public static String toString(long msb, long lsb) should be private. There's no compelling reason to add this to the API. - Have you run this code against any of the existing regression tests? Yes, I ran the jtreg UUID and Long tests, all pass. I ran the Apache Harmony UUID test cases against the pre-integrated version of the code. (There should only have been minor modifications since then, variable renamings, whatnot…) OK, once we have a final webrev then I will run final tests and push this! Hi, Here are the updates to the webrev. I hope the changes are in line with what you'd had in mind: http://dl.dropbox.com/u/1422321/uuid_webrev/index.html http://dl.dropbox.com/u/1422321/uuid_webrev.zip Please let me know if there are any further modifications I should make. Thanks! Steven
Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
Thank you for the comments Ulf. On Feb 12 2013, at 17:24 , Ulf Zibis wrote: Am 13.02.2013 00:30, schrieb Mike Duigou: Hi Steven; I have updated the patch for Java 8. There's somewhat less code sharing and a bit of refactoring than your last version but the performance should be about the same or a little better. http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/ Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ? Possibly. I didn't go looking too far with looking for additional improvements. Instead of calculating the mask each time, you could use: 309 private static String toUnsignedString(int i, int shift, int mask) { I think that would actually be inefficient. I haven't looked at the JITed code but the mask calculation is pretty cheap relative to parameter passing. Additionally in e.g. Integer.toString(int i, int radix) you could provide the fast version with: I think the only change I will make is to Integer.toUnsignedString() which becomes : return Long.toUnsignedString(toUnsignedLong(i), radix); Mike
Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
Am 13.02.2013 00:30, schrieb Mike Duigou: Hi Steven; I have updated the patch for Java 8. There's somewhat less code sharing and a bit of refactoring than your last version but the performance should be about the same or a little better. http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/ In java.lang.System: There is a space missing before '{' 1189 sun.misc.SharedSecrets.setJavaLangAccess(new sun.misc.JavaLangAccess(){ Additionally I more would like clazz instead klass for following parameters. -Ulf
Re: RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
Am 13.02.2013 02:34, schrieb Mike Duigou: Thank you for the comments Ulf. On Feb 12 2013, at 17:24 , Ulf Zibis wrote: Am 13.02.2013 00:30, schrieb Mike Duigou: Hi Steven; I have updated the patch for Java 8. There's somewhat less code sharing and a bit of refactoring than your last version but the performance should be about the same or a little better. http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/ Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ? Possibly. I didn't go looking too far with looking for additional improvements. Instead of calculating the mask each time, you could use: 309 private static String toUnsignedString(int i, int shift, int mask) { I think that would actually be inefficient. I haven't looked at the JITed code but the mask calculation is pretty cheap relative to parameter passing. I believe, JIT will inline the code, so there would be no parameter passing. Additionally the calculation of char count could be faster, if you would have short cuts like: numberOfLeading2Zeros(i) numberOfLeading4Zeros(i) numberOfLeading8Zeros(i) ... So the optimum would be with separate toString methods: String toBase2String(int i); String toBase4String(int i); String toBase8String(int i); ... In any case I would save 2 lines: 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i); 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1); 313 char[] buf = new char[charCount]; 316 int mask = (1 shift) - 1; -Ulf