Re: [PR] correct formatting of UUID v7 [couchdb]

via GitHub Mon, 13 Oct 2025 11:23:23 -0700


nickva commented on PR #5698:
URL: https://github.com/apache/couchdb/pull/5698#issuecomment-3398599747

> But the doc id was hex-encoded in the original, so not the '128-bit binary
value'.

I am using the raw binary value (128 bits, 16 bytes) directly in the new purge
optimization PR; it's used for the UUID, not the DocID. The raw 16-byte
representation is the shortest, so we cut the ID size in half compared with the
hex encoding, which helps with storing lots of them, b-tree sizes, etc. We
don't emit them in the API or accept them as input. However, JSON doesn't pass
raw binary through, so we can't return those in `_uuids` results. We'd have to
base64, base32 or base16 encode them. We do that with the other UUID types: the
`random` UUID is a 16-byte binary, which we hex-encode. But we could also have
base32-encoded it to keep it even shorter, for example.
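
To make the size difference concrete, here is a small Python sketch (not
CouchDB code; the `encoded_lengths` helper is made up for illustration) that
compares how long a 16-byte value becomes under each encoding, with padding
stripped:

```python
import base64
import os


def encoded_lengths(raw: bytes) -> dict:
    """Length of `raw` itself and of its base16/base32/base64 text forms (no padding)."""
    return {
        "raw": len(raw),
        "base16": len(base64.b16encode(raw)),
        "base32": len(base64.b32encode(raw).rstrip(b"=")),
        "base64": len(base64.urlsafe_b64encode(raw).rstrip(b"=")),
    }


# Any 16-byte value gives the same lengths; random bytes stand in for a UUID.
lengths = encoded_lengths(os.urandom(16))
print(lengths)
```

For 16 bytes this gives 32 characters in hex, 26 in base32 and 22 in base64,
so base32 is shorter than hex but still longer than the raw binary.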

Some UUID types have standard string representation formats: hex digits with
dashes between some parts. That's best in general, but the RFC recommends using
binaries (or in our case encoded binaries) when feasible.
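
As an illustration of the two text forms, this Python sketch uses the standard
`uuid` module (again, not CouchDB code; the byte value is an arbitrary example
with the version nibble set to 7) to render the same 16 bytes with and without
dashes:

```python
import uuid

# Arbitrary example bytes; only the version (7) and variant bits are meaningful here.
raw = bytes.fromhex("0199df3a7c1e7abc8def0123456789ab")

u = uuid.UUID(bytes=raw)
dashed = str(u)  # standard 8-4-4-4-12 form with dashes
plain = u.hex    # the same hex digits without dashes

print(len(raw), len(plain), len(dashed))
print(dashed)
```

The dashed form is 36 characters and the plain hex form is 32, both carrying
the same 16 bytes.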

I can see users wanting both, so we could make it configurable, but maybe
make UUID v7 hex without dashes the default, for compatibility. Users may
check/assert the length of these IDs somewhere. We do something of
that sort with revisions: if they look like a 32-byte string, we turn them into
binaries:
https://github.com/apache/couchdb/blob/16ced957924ff3cac20c3c9c3dd91d9a1d0ce7fc/src/couch/src/couch_doc.erl#L191-L193
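
A rough Python sketch of that kind of length-based check (a hypothetical
`normalize_id` helper, not a translation of the linked Erlang code) shows why a
dashed default could trip up such assumptions:

```python
import string


def normalize_id(value: str) -> bytes:
    """Hypothetical helper: pack a 32-character hex string into 16 raw bytes.

    Anything else (wrong length or non-hex characters) is kept as UTF-8 bytes.
    """
    if len(value) == 32 and all(c in string.hexdigits for c in value):
        return bytes.fromhex(value)
    return value.encode("utf-8")


compact = normalize_id("0199df3a7c1e7abc8def0123456789ab")
dashed = normalize_id("0199df3a-7c1e-7abc-8def-0123456789ab")
print(len(compact), len(dashed))
```

Here the undashed value takes the compact 16-byte path, while the dashed one
falls through and stays at 36 bytes.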

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
