On Wed, Nov 6, 2024 at 10:14 AM Andrey M. Borodin <x4...@yandex-team.ru> wrote:
>
> > On 5 Nov 2024, at 23:56, Andrey M. Borodin <x4...@yandex-team.ru> wrote:
> >
> > <v30-0001-Implement-UUID-v7.patch>
>
> Some more thoughts on this patch version:
>
> 0. Comment mentioning nanoseconds, while we do not need to carry anything
> /* Convert TimestampTz back and carry nanoseconds. */
>
> 1. There's unnecessary &3 in
> uuid->data[7] = uuid->data[7] | ((uuid->data[8] >> 6) & 3);
>
> 2. Currently we store 0..999 microseconds in 10 bits, so values 1000..1023
> are unused. We could use them for overflow. That would slightly increase
> non-overflowing capacity when generating more than million UUIDs per second
> on one backend. However, given current performance of our CSPRNG I do not
> think this feature worth code complexity.
>
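To make the arithmetic in point 2 concrete, here is a minimal Python sketch. It is not code from the patch: the function names and constants are hypothetical, and it only assumes what the quoted text states, namely that a 0..999 microsecond fraction is stored in a 10-bit field.

```python
# Hypothetical illustration only; none of these names come from the patch.
FRACTION_BITS = 10   # width of the sub-millisecond field discussed above
US_PER_MS = 1000     # microseconds in one millisecond


def sub_ms_fraction(unix_us: int) -> int:
    """Return the microsecond offset within the current millisecond (0..999)."""
    return unix_us % US_PER_MS


def unused_codes(bits: int = FRACTION_BITS) -> range:
    """Return the field values that a 0..999 fraction never produces."""
    return range(US_PER_MS, 1 << bits)


if __name__ == "__main__":
    spare = unused_codes()
    # A 10-bit field has 1024 codes; 1000 are used, so 24 remain.
    print(len(spare), spare[0], spare[-1])  # prints: 24 1000 1023
```

The 24 spare codes are the only headroom available for overflow within a single millisecond, which is why the gain described in point 2 is small.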
While using only 10 bits for microseconds keeps the implementation
simple, I'm not sure that 10 bits is enough to generate UUIDs at
microsecond granularity without losing monotonicity. Since the 10-bit
microsecond value is used as is in the rand_a space, at most 1000 UUIDs
can be generated per millisecond without losing monotonicity. For
example, in my environment, it took 1808 milliseconds to generate
1 million UUIDs, which is about 553 UUIDs per millisecond. As UUID
generation performance improves, I think 10 bits will not be enough.

=# select count(uuidv7()) from generate_series(1, 1_000_000);
  count
---------
 1000000
(1 row)

Time: 1808.734 ms

I found a similar comment from Sergey Prokhorenko[1]. He also mentioned:

> 4) Microsecond timestamp fraction subtracts 10 bits from random data, which
> increases the risk of collision. In the counter, almost all bits are
> initialized with a random number, which reduces the risk of collision.

I feel that it's better to switch to Method 1 or 2 with a counter space
of 12 bits or more.

Regards,

[1] https://www.postgresql.org/message-id/305478845.5279532.1712440778735%40mail.yahoo.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com