On 21/05/2015 09:33, J Decker wrote:
> chance of collision is slim... it's a truly large amount of bits...
> There's libuuid for linux, and builtin functions for windows to create them.

The chance of collision is only slim if you (as some people have already
indicated) have a proper entropy source to generate type 4s. In any other case
the chance of collision is pretty much guaranteed, and not just once, but for
each PK entry because of how PRNGs work.

The other main concern that seems to be the case here is merging data from
separate DBs with the same PK. Of course, if you are dealing with multiple
clients with individual input in local DBs, you would not just take the DBs
and slap them one on top of the other on the central storage location; that is
just being extremely naive.

I also wouldn't want to be the poor server having to check all UUIDs for
matching entries on every delta if those are used, which, because of them
being pseudo-random is very much a requirement.

> Oh and the other feature! of UUIDs is merging databases... if I have a
> database deployed to 15 customers, each with their own ORGID I can easily
> throw them in the same database without worrying about fixing up primary
> keys and all records related to those keys.

You wouldn't with sequential PKs either, but instead of stuffing 2 types of
data into a single field (which I think is also a bad idea) you simply have an
extra field with ORGID. The PK should always be unique to the target DB,
unrelated to the source DBs, when merging. Blurring those lines is asking for
trouble, IMHO.

So, I think the following can be said:
* UUIDs are not human-readable in case someone actually needs to remember a PK
* If you stuff an ORGID into the UUID, you're actually making it easier to
leak data; only a truly random UUID would make sense.
* UUIDs offer no key space advantage to a DB that is actually relevant to even
large DB systems (let alone normal SQLite use)
* UUIDs are not offering guarantees of uniqueness which a one-way
auto-increment would, requiring extensive checking "just in case"
(contributing to the next point)
* UUIDs depend on the client-side implementation of a PRNG and entropy source,
which may be flawed, produce repeatable results, etc.
* UUIDs take more resources to use (memory, CPU cycles for comp, hash, etc.)
which I think is right against what you'd want for SQLite

As a result, I don't think it's wise to use them, objectively speaking. It has
very little to do with having used them in the past myself or not (OT: that is
actually a psychological trap employed by many shady business models ;) ).
I do hope that if you use them, you never run into collisions and having to
sort out the pieces afterwards.

M.

Reply via email to