On Mon, 29 Sep 2025 15:08:14 GMT, Kieran Farrell <[email protected]> wrote:
>> With the recent approval of UUIDv7
>> (https://datatracker.ietf.org/doc/rfc9562/), this PR aims to add a new
>> static method UUID.timestampUUID() which constructs and returns a UUID in
>> support of the new time generated UUID version.
>>
>> The specification requires embedding the current timestamp in milliseconds
>> into the first bits 0–47. The version number goes in bits 48–51; bits 52–63
>> are available for sub-millisecond precision or for pseudorandom data. The
>> variant is set in bits 64–65. The remaining bits 66–127 are free to use for
>> more pseudorandom data or to employ a counter based approach for increased
>> time precision
>> (https://www.rfc-editor.org/rfc/rfc9562.html#name-uuid-version-7).
>>
>> The choice of implementation comes down to balancing the sensitivity level
>> of being able to distinguish UUIDs created less than 1 ms apart with
>> performance. A test simulating a high-concurrency environment with 4 threads
>> generating 10000 UUIDv7 values in parallel to measure the collision rate of
>> each implementation (the number of times the time based portion of the UUID
>> was not unique and entries could not be distinguished by time) yielded the
>> following results for each implementation:
>>
>>
>> - random-byte-only - 99.8%
>> - higher-precision - 3.5%
>> - counter-based - 0%
>>
>>
>> Performance tests show a decrease in performance as expected with the
>> counter based implementation due to the introduction of synchronization:
>>
>> - random-byte-only 143.487 ± 10.932 ns/op
>> - higher-precision 149.651 ± 8.438 ns/op
>> - counter-based 245.036 ± 2.943 ns/op
>>
>> The best balance here might be to employ a higher-precision implementation
>> as the large increase in time sensitivity comes at a very slight performance
>> cost.
>
> Kieran Farrell has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   missing semicolon

Adding support for UUID v7 also includes **sorting correctly**, IMO.
This has always been incorrect in the JDK as I see it, but back in the days of UUIDv1 to v4 nobody really cared that much how a UUID would sort. Enter UUID v7 and sorting is now important to get right.

So what is the problem? The existing `UUID.compareTo()` method compares the two longs (nothing wrong with that), but it compares them as SIGNED values where an UNSIGNED comparison is needed. The problem was recognized years ago in [JDK-7025832](https://bugs.openjdk.org/browse/JDK-7025832), but the change was rejected due to concerns over backward compatibility.

The problem, once UUID v7 is introduced, is that it becomes apparent that the JDK does not sort UUIDs in the same way as the database does, or indeed any other language. Previously this was less of a concern because there was less reason to sort UUIDs.

To be specific, what you expect, and what both the old RFC-4122 spec and the newer RFC-9562 state, each in its own words, is that UUIDs should be sorted lexicographically, i.e. as if by comparing two byte arrays (len=16) where each byte is a value 0 to 255 (as opposed to a value -128 to 127). An implementation could be:

    public int compareToLexi(UUID val) {
        int mostSigBits = Long.compareUnsigned(this.mostSigBits, val.mostSigBits);
        return mostSigBits != 0
                ? mostSigBits
                : Long.compareUnsigned(this.leastSigBits, val.leastSigBits);
    }

This is equivalent to a method which compares byte arrays as described above.

I do not suggest changing the existing `compareTo()` logic. But at the very least this legacy problem should be highlighted somewhere in the Javadoc. Addressing this, at least with a comment, would be part of a proper UUIDv7 implementation.

My 2c.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25303#issuecomment-3352041251
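To make the signed versus unsigned difference described above concrete, here is a minimal sketch. The class name `UuidOrderDemo` and the `UNSIGNED_ORDER` comparator are made up for illustration and are not part of `java.util.UUID`; the two UUID strings are hand-picked values that differ in the top bit of the most significant long, not generated UUIDs.

```java
import java.util.Comparator;
import java.util.UUID;

public class UuidOrderDemo {

    // Hypothetical helper (not a JDK API): orders UUIDs as unsigned 128-bit
    // values, which matches a byte-by-byte comparison of the 16 bytes.
    static final Comparator<UUID> UNSIGNED_ORDER = (a, b) -> {
        int hi = Long.compareUnsigned(a.getMostSignificantBits(),
                                      b.getMostSignificantBits());
        return hi != 0
                ? hi
                : Long.compareUnsigned(a.getLeastSignificantBits(),
                                       b.getLeastSignificantBits());
    };

    public static void main(String[] args) {
        // Hand-picked values: "low" has the top bit of the first long clear,
        // "high" has it set, so "high" is negative when read as a signed long.
        UUID low = UUID.fromString("7fffffff-ffff-7000-8000-000000000000");
        UUID high = UUID.fromString("80000000-0000-7000-8000-000000000000");

        // Existing JDK ordering (signed comparison of the two longs).
        System.out.println("compareTo:      " + Integer.signum(low.compareTo(high)));

        // Unsigned ordering, as a byte-wise sort would produce.
        System.out.println("UNSIGNED_ORDER: "
                + Integer.signum(UNSIGNED_ORDER.compare(low, high)));
    }
}
```

With the signed comparison, `low` sorts after `high`, whereas the unsigned comparator places it first, which is the order a byte-wise sort of the textual form would give.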
