I apologize for the lack of reply. I had missed this thread while on holiday.
As Aleksey suggests the first step is to complete the OCA/CLA. Until that step is complete we won't be able to act upon this contribution in any formal way. I encourage you to update the patch with any specific feedback you receive. Thanks in advance for this contribution! Mike On Jan 9 2013, at 09:51 , Steven Schlansker wrote: > Hello again, > > I sent this email a week ago and have received no replies. Is there any step > I have missed necessary to contribute to the JDK libraries? > > I am very interested in making your lives easier, so please let me know if I > am in the wrong place or are otherwise misguided. > > Thanks! > Steven > > > On Dec 29, 2012, at 9:25 AM, Steven Schlansker <stevenschlans...@gmail.com> > wrote: > >> Hello core-libs-dev, >> >> My company uses UUIDs throughout our software. We recently identified that >> the java.util.UUID implementations of fromString and toString are horridly >> inefficient. >> >> An incomplete list of the inefficiencies: >> >> * fromString uses a regular expression that is not even precompiled >> * fromString uses a regular expression to parse out the "-" markers, >> requiring the allocation of many String and String[] objects, when a simple >> linear scan works just fine >> * fromString unnecessarily allocates multiple Long objects >> >> * toString creates a char[64], a String object which copies that, and a >> sub-String object for *each* of the 5 hex digit segments >> * toString produces a fixed-length result but involves multiple >> StringBuilder concatenations >> >> Everyone that I have shown the relevant code to has been horrified by the >> complete lack of care towards object allocations. While I am a big fan of >> the "object allocation is free until otherwise proven" philosophy, we traced >> multiple production problems down to these hotspots. >> >> But instead of just whining about it, I wish to contribute a patch which I >> believe fixes the problem. This is my first attempt to submit anything to >> the JDK, so please be nice :-) >> >> My initial attempt has yielded micro benchmarks with very favorable >> outcomes. By taking the appropriate code from java.lang.Long, modifying it >> to work on a single pre-allocated array+offset rather than returning a >> String, and scanning for dashes instead of regex splitting I reduced times >> 3-6x and object allocations to the low single digits. >> >> My Google Caliper benchmark is available here, to ensure that I have not >> committed any benchmark sins: >> https://gist.github.com/4356621 >> >> benchmark instances Bytes ns linear runtime >> JdkUuidFromString 51.00 1544.0 608.2 ============================== >> NewUuidFromString 2.00 72.0 179.1 ======== >> >> JdkUuidToString 31.00 1720.0 321.4 =============== >> NewUuidToString 3.00 200.0 51.5 == >> >> In particular, the reduction of object allocations from ~1.5KB to 72/200 >> bytes dramatically reduces GC pressure when you sit in a tight loop >> deserializing millions of UUIDs from strings. >> >> I believe the same patch can (and should?) apply to jdk7u and jdk8, as the >> code does not seem to have changed. >> >> I would appreciate any feedback on the code style, efficiency, or >> correctness that you can offer. I have run the "java/util/UUID" jtreg suite >> against this and it passes. We stole the more thorough Apache Harmony tests >> for this and they pass as well: >> https://github.com/stevenschlansker/components-ness-core/blob/master/src/test/java/com/nesscomputing/uuid/TestNessUUID.java >> >> I have not yet secured a CLA, but my company is in the process of signing >> one. >> >> The rest of the message is a "hg export" of my change set, which is current >> to the tip of jdk7. >> Happy holidays, and thank you for your time! >> Steven Schlansker >> >> >> >> >> # HG changeset patch >> # User Steven Schlansker <ste...@nesscomputing.com> >> # Date 1356737383 0 >> # Node ID a812c963b96f08112f6805cbc380b12af196f788 >> # Parent 9b8c96f96a0f9a5801b55530a387fefbe50482a3 >> java.util.UUID: improve performance of UUID#toString and UUID#fromString by >> avoiding many unnecessary object allocations. >> >> benchmark instances Bytes ns linear runtime >> JdkUuidFromString 51.00 1544.0 608.2 ============================== >> NewUuidFromString 2.00 72.0 179.1 ======== >> >> JdkUuidToString 31.00 1720.0 321.4 =============== >> NewUuidToString 3.00 200.0 51.5 == >> >> diff -r 9b8c96f96a0f -r a812c963b96f src/share/classes/java/util/UUID.java >> --- a/src/share/classes/java/util/UUID.java Mon Jun 27 13:21:34 2011 -0700 >> +++ b/src/share/classes/java/util/UUID.java Fri Dec 28 23:29:43 2012 +0000 >> @@ -189,26 +189,79 @@ >> * described in {@link #toString} >> * >> */ >> - public static UUID fromString(String name) { >> - String[] components = name.split("-"); >> - if (components.length != 5) >> - throw new IllegalArgumentException("Invalid UUID string: >> "+name); >> - for (int i=0; i<5; i++) >> - components[i] = "0x"+components[i]; >> + public static UUID fromString(String str) { >> + int dashCount = 4; >> + int [] dashPos = new int [6]; >> + dashPos[0] = -1; >> + dashPos[5] = str.length(); >> >> - long mostSigBits = Long.decode(components[0]).longValue(); >> + for (int i = str.length()-1; i >= 0; i--) { >> + if (str.charAt(i) == '-') { >> + if (dashCount == 0) { >> + throw new IllegalArgumentException("Invalid UUID >> string: " + str); >> + } >> + dashPos[dashCount--] = i; >> + } >> + } >> + >> + if (dashCount > 0) { >> + throw new IllegalArgumentException("Invalid UUID string: " + >> str); >> + } >> + >> + long mostSigBits = decode(str, dashPos, 0) & 0xffffffffL; >> mostSigBits <<= 16; >> - mostSigBits |= Long.decode(components[1]).longValue(); >> + mostSigBits |= decode(str, dashPos, 1) & 0xffffL; >> mostSigBits <<= 16; >> - mostSigBits |= Long.decode(components[2]).longValue(); >> + mostSigBits |= decode(str, dashPos, 2) & 0xffff); >> >> - long leastSigBits = Long.decode(components[3]).longValue(); >> + long leastSigBits = decode(str, dashPos, 3) & 0xffffL; >> leastSigBits <<= 48; >> - leastSigBits |= Long.decode(components[4]).longValue(); >> + leastSigBits |= decode(str, dashPos, 4) & 0xffffffffffffL; >> >> return new UUID(mostSigBits, leastSigBits); >> } >> >> + private static long decode(final String str, final int [] dashPos, >> final int field) { >> + int start = dashPos[field]+1; >> + int end = dashPos[field+1]; >> + if (start >= end) { >> + throw new IllegalArgumentException("Invalid UUID string: " + >> str); >> + } >> + long curr = 0; >> + for (int i = start; i < end; i++) { >> + int x = getNibbleFromChar(str.charAt(i)); >> + curr <<= 4; >> + if (curr < 0) { >> + throw new NumberFormatException("long overflow"); >> + } >> + curr |= x; >> + } >> + return curr; >> + } >> + >> + private static final int NUM_ALPHA_DIFF = 'A' - '9' - 1; >> + private static final int LOWER_UPPER_DIFF = 'a' - 'A'; >> + >> + private static int getNibbleFromChar(final char c) >> + { >> + int x = c - '0'; >> + if (x > 9) { >> + x -= NUM_ALPHA_DIFF; // difference between '9' and 'A' >> + if (x > 15) { >> + x -= LOWER_UPPER_DIFF; // difference between 'a' and 'A' >> + } >> + if (x < 10) { >> + throw new IllegalArgumentException(c + " is not a valid >> character for an UUID string"); >> + } >> + } >> + >> + if (x < 0 || x > 15) { >> + throw new IllegalArgumentException(c + " is not a valid >> character for an UUID string"); >> + } >> + >> + return x; >> + } >> + >> // Field Accessor Methods >> >> /** >> @@ -372,19 +425,46 @@ >> * @return A string representation of this {@code UUID} >> */ >> public String toString() { >> - return (digits(mostSigBits >> 32, 8) + "-" + >> - digits(mostSigBits >> 16, 4) + "-" + >> - digits(mostSigBits, 4) + "-" + >> - digits(leastSigBits >> 48, 4) + "-" + >> - digits(leastSigBits, 12)); >> + return toString(getMostSignificantBits(), >> getLeastSignificantBits()); >> } >> >> - /** Returns val represented by the specified number of hex digits. */ >> - private static String digits(long val, int digits) { >> - long hi = 1L << (digits * 4); >> - return Long.toHexString(hi | (val & (hi - 1))).substring(1); >> + public static String toString(long msb, long lsb) { >> + char[] uuidChars = new char[36]; >> + >> + hexDigits(uuidChars, 0, 8, msb >> 32); >> + uuidChars[8] = '-'; >> + hexDigits(uuidChars, 9, 4, msb >> 16); >> + uuidChars[13] = '-'; >> + hexDigits(uuidChars, 14, 4, msb); >> + uuidChars[18] = '-'; >> + hexDigits(uuidChars, 19, 4, lsb >> 48); >> + uuidChars[23] = '-'; >> + hexDigits(uuidChars, 24, 12, lsb); >> + >> + return new String(uuidChars); >> } >> >> + private static void hexDigits(char[] dest, int offset, int digits, long >> val) { >> + long hi = 1L << (digits * 4); >> + toUnsignedString(dest, offset, digits, hi | (val & (hi - 1)), 4); >> + } >> + >> + private final static char[] HEX_DIGITS = { >> + '0' , '1' , '2' , '3' , '4' , '5' , >> + '6' , '7' , '8' , '9' , 'a' , 'b' , >> + 'c' , 'd' , 'e' , 'f' >> + }; >> + >> + private static void toUnsignedString(char[] dest, int offset, int len, >> long i, int shift) { >> + int charPos = len; >> + int radix = 1 << shift; >> + long mask = radix - 1; >> + do { >> + dest[offset + --charPos] = HEX_DIGITS[(int)(i & mask)]; >> + i >>>= shift; >> + } while (i != 0 && charPos > 0); >> + } >> + >> /** >> * Returns a hash code for this {@code UUID}. >> * >