[sqlite] AUTOINC vs. UUIDs

Stephen Chrzanowski Fri, 22 May 2015 00:14:37 -0400

{{I just got a bounced message.. Reposting}}

I've been watching this thread from the beginning with great interest, and
I still don't see the difference between using a UUID or an auto-inc
integer as a PK at the very raw, basic level. The database will only see
them as a string of bits or bytes and handle them accordingly.  IMO, using
UUIDs is extra overhead for humans to deal with, which is going to cause
more grief than necessary.

Whether to use UUIDs or not is going to be directly related to what the
incentive to use UUIDs is.  If the source of the information to be
controlled is in one location and manipulated by only one "server", then
using UUIDs seems to be a bit ridiculous from a human reading standpoint.
To a computer and to a database engine, 128 bits is 128 bits no matter
what way you slice it.  As far as the database is concerned, whether
you're using UUIDs or auto-inc integers, a key is just a run of bytes on a
drive; a UUID takes 16 bytes as a blob (36 as text) against 8 for a 64-bit
integer, so it is somewhat larger but nothing exotic.  So from a storage
logistics perspective, either works. From a performance standpoint, UUIDs
are going to slow things down (at a microsecond level) when your database
gets large.  I'm probably out of tune with how indexes actually work and
are written to the database file, but I'm sure that after a few million
records, fragmentation is going to start coming up, where the DB engine is
going to do more in-memory sorting to deal with large indexes and the
holes random UUIDs introduce, instead of being able to sequentially read
the data and find the results it needs.
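
To put rough numbers on the storage point, here is a minimal sketch using
Python's sqlite3 module. The table layout, the one-character payload, the
row count, and the function name are my own assumptions for illustration,
and the UUIDs are stored as 16-byte blobs. Treat the output as indicative,
not as a benchmark.

```python
import os
import sqlite3
import tempfile
import uuid

ROWS = 20000  # assumed row count, small enough to run quickly


def file_size_for(key_kind, rows=ROWS):
    """Build a one-table database with the given key style; return its size in bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        con = sqlite3.connect(path)
        if key_kind == "autoinc":
            con.execute(
                "CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)")
            con.executemany("INSERT INTO t (payload) VALUES (?)",
                            (("x",) for _ in range(rows)))
        else:
            # Random (version 4) UUIDs stored as 16-byte blobs.
            con.execute("CREATE TABLE t (id BLOB PRIMARY KEY, payload TEXT)")
            con.executemany("INSERT INTO t (id, payload) VALUES (?, ?)",
                            ((uuid.uuid4().bytes, "x") for _ in range(rows)))
        con.commit()
        con.close()
        return os.path.getsize(path)


autoinc_bytes = file_size_for("autoinc")
uuid_bytes = file_size_for("uuid")
print("auto-inc:", autoinc_bytes, "bytes")
print("uuid:    ", uuid_bytes, "bytes")
```

With a payload this small the UUID-keyed file comes out noticeably larger,
partly because SQLite keeps a separate index for a primary key that is not
an INTEGER PRIMARY KEY. With wide rows the gap matters much less.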

Now, when you get into the discussion of disconnected datasets, where
client machines can take data from a server and add to, delete from,
and/or modify what they took, you're going to need some methodology of
tracking the changes when the client wants to put it back.  The theory
behind UUIDs is excellent, but with disconnected datasets you have to ask
which machine is the winner when both update the same "record".  A UUID
isn't going to come to the rescue when two clients get the same data, do
different work offline, then come back and upload to the server at
different times.  You'll obviously run into far more conflicts of this
kind than you'll ever get from a UUID collision or a PK violation,
regardless of which method of PK implementation is used.
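
A minimal sketch of that scenario, again in Python with sqlite3. The
"version" column and the checkout/upload helpers are hypothetical, just one
common way (optimistic versioning) to notice the conflict. The point is
that the UUID key identifies the record fine but says nothing about which
edit should win.

```python
import sqlite3
import uuid

server = sqlite3.connect(":memory:")
server.execute(
    "CREATE TABLE doc (id TEXT PRIMARY KEY, body TEXT, version INTEGER NOT NULL)")
doc_id = str(uuid.uuid4())
server.execute("INSERT INTO doc VALUES (?, ?, 1)", (doc_id, "original"))
server.commit()


def checkout(con, key):
    """Return (body, version) as a client would copy it before going offline."""
    return con.execute(
        "SELECT body, version FROM doc WHERE id = ?", (key,)).fetchone()


def upload(con, key, new_body, base_version):
    """Apply the edit only if nobody else has uploaded since the checkout."""
    cur = con.execute(
        "UPDATE doc SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, key, base_version))
    con.commit()
    return cur.rowcount == 1


# Two clients take the same record, work offline, then upload in turn.
_, version_a = checkout(server, doc_id)
_, version_b = checkout(server, doc_id)
a_ok = upload(server, doc_id, "edit from A", version_a)
b_ok = upload(server, doc_id, "edit from B", version_b)
final_body = checkout(server, doc_id)[0]
print(a_ok, b_ok, final_body)
```

Client B's upload is rejected because the record moved on after its
checkout. Deciding what to do next (merge, overwrite, ask a human) is the
hard part, and it is the same problem with an integer key.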

And my FINAL thought on this is, if you're going to migrate an entire
database infrastructure to UUIDs "just because", you need to weigh the
gains against the losses from a human readability standpoint.  UUIDs
absolutely do have their place, but for general use, and in the situation
where a single manager of the data is being used (as in, clients cannot
manage data offline), UUIDs only offer a greater chance of collision
compared to auto-inc fields, however small that chance is.
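
For a sense of scale on that collision chance, a small sketch of the usual
birthday approximation. It assumes version-4 UUIDs (122 random bits) from a
good random source, and the function name is mine. An auto-inc key issued
by a single server has no such chance at all.

```python
import math

RANDOM_BITS = 122  # a version-4 UUID has 122 random bits; the rest are fixed


def collision_probability(n, bits=RANDOM_BITS):
    """Birthday approximation: chance that n random IDs contain a duplicate."""
    space = 2.0 ** bits
    # 1 - exp(-n(n-1)/(2*space)), computed with expm1 to keep tiny values accurate.
    return -math.expm1(-n * (n - 1) / (2.0 * space))


p_billion = collision_probability(10**9)
print(p_billion)
```

Even at a billion rows the figure is vanishingly small, so the argument
against UUIDs here rests on readability and overhead, not on collisions
being likely.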

On Wed, May 20, 2015 at 2:05 PM, Simon Slavin <slavins at bigfraud.org> wrote:

> Posting this not because I agree with it but because the subject has come
> up here a couple of times.
>
> <
> https://www.clever-cloud.com/blog/engineering/2015/05/20/Why-Auto-Increment-Is-A-Terrible-Idea/
> >
>
> "Today, I'll talk about why we stopped using serial integers for our
> primary keys, and why we're now extensively using Universally Unique IDs
> (or UUIDs) almost everywhere."
>
> Simon.
