[sqlite] AUTOINC vs. UUIDs

Keith Medcalf Wed, 20 May 2015 22:22:55 -0600

Fossil does not use UUIDs.

Artifact IDs used by fossil are the SHA-1 hash of the file contents, and the 
check-in IDs are the SHA-1 hash of the check-in manifest contents.  They are 
*NOT* random but rather are 100% deterministic -- that is, if you run the SHA-1 
hash over the same input data you will ALWAYS get the same result.
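
To make the determinism concrete, here is a minimal sketch using Python's 
standard hashlib.  It is not fossil's actual code; the helper name artifact_id 
and the sample inputs are made up for illustration.

```python
import hashlib


def artifact_id(content: bytes) -> str:
    """Return the SHA-1 hex digest of the raw content bytes.

    Hypothetical helper: it only illustrates that the ID is a pure
    function of the input, with no random component anywhere.
    """
    return hashlib.sha1(content).hexdigest()


# Hashing the same bytes twice yields the same ID.
first = artifact_id(b"hello world\n")
second = artifact_id(b"hello world\n")

# Changing even one byte of the input yields a different ID.
changed = artifact_id(b"hello world!\n")

print(first == second)   # same input, same ID
print(first == changed)  # different input, different ID
```

Two machines that have never communicated will compute the same ID for the 
same content, which is exactly what a random UUID cannot offer.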

Whether or not the "definition" of a UUID includes a hash function does not 
make it deterministic.  The application of a hash function cannot increase 
the entropy of the underlying random data, nor can it make its "Universal" 
designation more than a Hope and Prayer.  I will, however, grant that in the 
case where the UUID is generated from a good hash function applied against a 
combination of local unique identity and random data (such as the FQDN of the 
machine, the current timestamp, and the output of a good local whirlpool of 
entropy), *and* it is verified to be locally unique, then it is *more likely* 
to be Universally unique than if it is based directly (or by application of a 
hash function) on a purely random source.
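
A small sketch of the entropy point, assuming a deliberately tiny random 
source of one byte: however wide the hash output is, the number of distinct 
identifiers can never exceed the number of distinct inputs.  The name 
hashed_id is hypothetical.

```python
import hashlib


def hashed_id(seed: bytes) -> str:
    """Hash a seed into a 160-bit hex identifier (illustrative only)."""
    return hashlib.sha1(seed).hexdigest()


# A one-byte seed has only 256 possible values, so enumerating them all
# enumerates every identifier this scheme can ever produce.
possible_seeds = [bytes([value]) for value in range(256)]
distinct_ids = {hashed_id(seed) for seed in possible_seeds}

# The identifiers look 160 bits wide, but the set is bounded by the
# number of seeds, not by the width of the hash output.
print(len(distinct_ids) <= len(possible_seeds))
```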

Using a UUID as a prayerful means of generating unique identifiers is 
ill-advised.  If you want a generated nonsensical key then apply a hash 
function over the real record key and use that.  Triggers and Referential 
Integrity constraints can ensure that the generated key is maintained in sync 
with the changes to the key fields.  Then and only then will you be able to 
merge or update data from multiple distributed databases into a master.
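
One way this could look, as a rough sketch rather than a recommended schema. 
The customer table, its columns, the trigger names, and the record_key_hash 
function are all invented for the example.  SQLite has no built-in SHA-1 SQL 
function in its core, so the sketch registers one from Python with 
create_function.

```python
import hashlib
import sqlite3


def record_key_hash(country, tax_id):
    """Hash the real (natural) key fields into a generated key.

    The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    """
    material = "\x1f".join([country, tax_id]).encode("utf-8")
    return hashlib.sha1(material).hexdigest()


# Hypothetical schema: (country, tax_id) is the real record key and
# key_hash is the generated key, kept in sync by two triggers.
SCHEMA = """
CREATE TABLE customer (
    country  TEXT NOT NULL,
    tax_id   TEXT NOT NULL,
    name     TEXT,
    key_hash TEXT UNIQUE,
    PRIMARY KEY (country, tax_id)
);
CREATE TRIGGER customer_hash_insert AFTER INSERT ON customer
BEGIN
    UPDATE customer SET key_hash = record_key_hash(NEW.country, NEW.tax_id)
    WHERE country = NEW.country AND tax_id = NEW.tax_id;
END;
CREATE TRIGGER customer_hash_update AFTER UPDATE OF country, tax_id ON customer
BEGIN
    UPDATE customer SET key_hash = record_key_hash(NEW.country, NEW.tax_id)
    WHERE country = NEW.country AND tax_id = NEW.tax_id;
END;
"""


def open_db():
    """Open an in-memory database with the hash function and schema installed."""
    db = sqlite3.connect(":memory:")
    db.create_function("record_key_hash", 2, record_key_hash)
    db.executescript(SCHEMA)
    return db


def key_for(db, country, tax_id):
    """Look up the generated key for one record."""
    row = db.execute(
        "SELECT key_hash FROM customer WHERE country = ? AND tax_id = ?",
        (country, tax_id),
    ).fetchone()
    return row[0]


# Two independent databases insert the same real-world record.
site_a = open_db()
site_b = open_db()
site_a.execute("INSERT INTO customer (country, tax_id, name) VALUES ('CA', '123', 'Acme')")
site_b.execute("INSERT INTO customer (country, tax_id, name) VALUES ('CA', '123', 'Acme Inc')")

# Both arrive at the same generated key without ever talking to each other.
print(key_for(site_a, "CA", "123") == key_for(site_b, "CA", "123"))
```

Because the generated key is a function of the real key, the same record gets 
the same key everywhere, and different records collide only if the hash 
itself collides.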

> -----Original Message-----
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> bounces at mailinglists.sqlite.org] On Behalf Of Scott Robison
> Sent: Wednesday, 20 May, 2015 21:42
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] AUTOINC vs. UUIDs
> 
> On Wed, May 20, 2015 at 7:20 PM, R.Smith <rsmith at rsweb.co.za> wrote:
> 
> >
> > On 2015-05-21 01:52 AM, Peter Aronson wrote:
> >
> >> Now you're just getting silly.  What if the application sets all rowids,
> >> everywhere to 1?  The fact is, the chance of collision on a UUID is pretty
> >> astronomically low as long as a decent source of entropy is used (see
> >> http://en.wikipedia.org....
> >>
> >
> > I think Keith's point (which I very much agree with) is that
> > astronomically big is still not guaranteed - and ANY solution that relies
> > on something not guaranteed is a bad solution. I'd much rather even ensure
> > that similar ID's are used client-side, then KNOW that that is the case and
> > implement a solution that understands this and deals with it (such as
> > simply prepending a device-specific ID or some such) to ensure 100% secure
> > uniqueness server-side - no need to rely on astronomically big
> > randomnessessess.
> >
> 
> Then I guess all the distributed version control systems that rely on
> unique hash values (including fossil, the sqlite DVCS) are a bad solution.
> Note that two of the five defined standards for UUID are based on hashes.
> Okay, so modern hashes are longer than 128 bits, but that only reduces the
> probability of a collision, it does not eliminate it.
> 
> "But that's not used as the primary key of a SQLite or other relational
> table" you might say. Except it is the unique key for a conceptual table.
> 
> Certainly I do not agree with the originally linked article that integer
> primary keys should almost always be avoided, and there was a lot of
> exaggerating in the risks involved. Still, UUIDs (or other similarly long
> or longer hash or quality random number source based IDs) can be an
> effective technique when used appropriately.
> 
> --
> Scott Robison
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
