On Wed, Nov 16, 2016 at 02:51:35PM -0600, Jeffrey Mattox wrote:
> I think this discussion is about apples and oranges.  UUID stands for
> universally UNIQUE identifier, so there won't be any collisions.  It
> looks random, but it never repeats.  [...]

No, DRH is right that this depends on how good your entropy source (and,
typically, the PRNG seeded from that entropy) is.  Putting "universally
unique" in the name does not make it so; only the details of how the
sausage is made can take care of it.
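
To put rough numbers on that, here is a minimal sketch of the usual
birthday-bound approximation.  It assumes version-4 UUIDs (122 random
bits) drawn from an ideal uniform source; the function name and the
32-bit "weak seed" case are mine, purely for illustration.

```python
import math


def collision_probability(n_ids: int, random_bits: int = 122) -> float:
    """Approximate chance of at least one collision among n_ids values
    drawn uniformly from 2**random_bits possibilities (birthday bound)."""
    space = 2 ** random_bits
    # p ~= 1 - exp(-n(n-1) / (2N)); expm1 keeps precision when p is tiny.
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


if __name__ == "__main__":
    # A billion IDs with the full 122 bits of randomness.
    print(collision_probability(10**9))
    # The same generator if only 32 bits of real entropy feed the PRNG
    # (hypothetical weak seed): collisions become likely very quickly.
    print(collision_probability(100_000, random_bits=32))
```

The point is that the second case is what you actually get if the
entropy feeding the PRNG is poor, whatever the ID format advertises.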

Regardless, using UUIDs to make a distributed DB is not that great.  It
does work though.  Microsoft's Active Directory (AD), for example, uses
96-bit UUID-like values to form "domain SIDs", with user, group, and
other SIDs being formed by appending a 32-bit "relative ID" to the domain
SID.  This has worked rather well for MSFT, and it has allowed the
creation of "forests" of domains and forests of forests.  I do think AD
checks SID uniqueness within each forest, and IIRC there's a way to
handle SID collisions in forests of forests.
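
As an illustration of that layout only (not of how AD actually generates
or allocates these values), a sketch might look like the following.  The
function names are hypothetical; the "S-1-5-21-" prefix is the usual
string form of a domain SID, followed by three 32-bit sub-authorities.

```python
import secrets


def new_domain_sid() -> str:
    """Build a domain-SID-like string from 96 random bits."""
    # Three random 32-bit sub-authorities make up the 96-bit domain part.
    subs = [secrets.randbits(32) for _ in range(3)]
    return "S-1-5-21-" + "-".join(str(s) for s in subs)


def account_sid(domain_sid: str, rid: int) -> str:
    """Append a 32-bit relative ID (RID) to a domain SID."""
    if not 0 <= rid < 2**32:
        raise ValueError("RID must fit in 32 bits")
    return f"{domain_sid}-{rid}"


if __name__ == "__main__":
    domain = new_domain_sid()
    print(account_sid(domain, 1001))
```

Only the domain part needs to be globally unique; the RIDs can be handed
out sequentially within the domain.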

Uniqueness checks are not too expensive when they are feasible at all.

In the AD forest case they are feasible, while in the forest of forests
case they are not.

The alternative to randomly-generated IDs would be to have a global
registry (perhaps hierarchical), not unlike DNS, or ASN.1 OID arcs, but
there is a real cost to having to have a global registry.

So in a distributed system roughly like SPARQL, or AD, say, UUIDs will
do.  You might store them as 16-byte blobs rather than 36-character
strings to avoid wasting space, but, whatever.
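
For what it's worth, a minimal sketch of the blob approach using Python's
sqlite3 and uuid modules follows.  The table and function names are made
up for the example; the PRIMARY KEY doubles as the cheap local uniqueness
check mentioned above.

```python
import sqlite3
import uuid


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical 'item' table."""
    conn = sqlite3.connect(":memory:")
    # The primary key index is what enforces uniqueness within this DB.
    conn.execute("CREATE TABLE item (id BLOB PRIMARY KEY NOT NULL, name TEXT)")
    return conn


def insert_item(conn: sqlite3.Connection, name: str, item_id=None) -> uuid.UUID:
    """Insert a row keyed by a UUID stored as its 16 raw bytes."""
    item_id = item_id or uuid.uuid4()
    # Python bytes are bound as a BLOB: 16 bytes instead of 36 text chars.
    conn.execute("INSERT INTO item (id, name) VALUES (?, ?)",
                 (item_id.bytes, name))
    return item_id


if __name__ == "__main__":
    db = open_db()
    first = insert_item(db, "first")
    print(db.execute("SELECT length(id) FROM item").fetchone()[0])
    try:
        insert_item(db, "duplicate", item_id=first)
    except sqlite3.IntegrityError as exc:
        print("collision detected:", exc)
```

A collision shows up as an ordinary constraint failure, so the caller can
just generate a new ID and retry.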

Nico