Re: [sqlite] Article on AUTOINC vs. UUIDs

J Decker Thu, 30 Nov 2017 08:27:48 -0800

On Thu, Nov 30, 2017 at 7:14 AM, Wout Mertens <[email protected]>
wrote:

> The article is a bit muddled, the only real argument I could find is that
> auto-inc makes next keys easy to guess, which is information leakage, which
> means that is a potential security problem if that information were somehow
> useful.
>
> Seems to me that problem can be resolved by having an auto-inc type that
> skips a random amount on every insert? E.g. the next id is the last ID +
> randomFromRange(1, 10000).
>
> UUIDs are nice though if you don't have a natural key available and you are
> generating keys on multiple systems. Sadly, they are strings in sqlite, and
> it would be more efficient to store and compare them as their 128-bit
> representation. Is there an extension that can do that?
>
> I don't know about the collision rate; if your systems are set up in a sane
> way, the MAC address alone would prevent collisions, no? And on the same
> system, are collisions even possible?
>

virtual machines can easily duplicate mac addresses if copied and not
updated correctly.
It's also a leak of information, and if it's the server service that's
creating the UUID's using it's mac, doesn't matter that it's for lots of
clients, the odds of collision increase greatly... even if you have a small
cloud of 10's or 100's of systems.... the leak of information is why MS
moved away from mac addresses when genertaing GUIDs especially for things
like COM service IDs (that was more than a decade ago, so I'm finding it
hard to find articles on it).  There are also network device manufacturers
that relesaed hardware with duplicate mac addresses; but since they were
between multiple lots of product it wasn't noticed to a great extent.

For the project I used UUIDs extensively for, it was written in C# with
datasets with all foriegn keys modeled in it.   On insert, if there was a
collision, it would regenerate a UUID and update the offending row, which
would automaically propagate through all child rows; and since the parent
had to exist before inserting the children then continuing on from that
point was very little work for the database.  Though it only used a few
thousand IDs and after many years of usage would only be a couple hundred
thousand IDs I still coded it paranoid-like.

Some databases have preference for using Sequential UUIDs.

https://stackoverflow.com/questions/170346/what-are-the-performance-improvement-of-sequential-guid-over-standard-guid
Laboratory Test – SQL Server

VS2008 test, 10 concurrent users, no think time, benchmark process with 600
inserts in batch for leaf table
Standard Guid
Avg. Process duration: 10.5 sec
Avg. Request for second: 54.6
Avg. Resp. Time: 0.26

Sequential Guid
Avg. Process duration: 4.6 sec
Avg. Request for second: 87.1
Avg. Resp. Time: 0.12

Results on Oracle (sorry, different tool used for test) 1.327.613 insert on
a table with a Guid PK

Standard Guid, 0.02 sec. elapsed time for each insert, 2.861 sec. of CPU
time, total of 31.049 sec. elapsed

Sequential Guid, 0.00 sec. elapsed time for each insert, 1.142 sec. of CPU
time, total of 3.667 sec. elapsed

Could really wish more languages had DataSet.

I would also like to make a note, that many criticisms are 'there's so many
bytes to have to compare', however, because of the highly random nature of
good UUIDs failure occurs quickly, usually within 4 bytes, which makes it
almost as good as an integer (especialy for things like SQLite that are
comparing numbers as strings anyway).... the only time the full thing is
compared is on the row that exactly matches.

'sides storage is cheap...
https://www.linkedin.com/pulse/20140414133905-9970539-peak-hard-drive/
would project 600TB drives by 2020, but another place noted that
commercial(home consumer) hard drives weren't increasing in size as fast as
datacenter/server drive storage.  but I can't find any recent articles that
also say that... looking at graphs of such information they all stop
2010-2013 ish... so no data for the last 4-7 years.... *shrug*

> On Thu, Nov 30, 2017 at 4:01 PM Keith Medcalf <[email protected]> wrote:
>
> >
> > Well, in my opinion the guy is an idiot.  The way to avoid the issues he
> > is describing as the problems with serial IDs (or using the RowID) are
> > simple to avoid:  DO NOT DO THOSE THINGS!  If you know that disclosing
> the
> > RowID in a URL is ungood, then DO NOT DO THAT.  Sheesh.  As my good buddy
> > Buggs Bunny would say "What a maroon!".
> >
> > Nonetheless, I have created an SQLite extension for Windows (2000 or
> > later) that will generate UUIDs using the builtin Windows RPC interface.
> > Apparently similar facilities are available on other OSes though all in
> > different manners (different functions in different libraries).  Note
> that
> > the silly proxies for the RPC functions are so that the compiler can
> > maintain correct linkage to the RPC libraries when using function
> pointers
> > -- the linkage through function pointers cast to (void*) works on 64-bit
> > Windows but not on 32-bit Windows.  On 32-bit windows not using a proxy
> > function to maintain the correct linkage results in the stack frame
> > corruption.
> >
> > Also, uuid generation function for V1/3/4/5 are available in the Python
> > standard uuid library, not mentioned in the article.
> >
> > File is sqlfwin.c located in
> > http://www.dessus.com/files/SQLiteExtensions.zip for the 32-bit Windows
> > versions; or
> > http://www.dessus.com/files/SQLiteExtensions64.zip for the 64-bit
> Windows
> > versions
> > Source is the same in both, .dll compiled with MinGW 7.1.0.2
> (dependencies
> > only to standard windows DLLs and to the subsystem runtime library).
> >
> > SQLite3 UDF functions returning blobs (16-byte UUID) are:
> > uuidCreateV1()
> > uuidCreateV4()
> > uuidFromString('text-uuid-rendering')
> >
> > And returning textual renderings are:
> > uuidStringCreateV1()
> > uuidStringCreateV4()
> > uuidToString(uuid-blob)
> >
> > The create functions are volatile (like the randomblob function), and the
> > To/From string functions are deterministic.
> >
> > sqlfwin.c also contains some other Windows API functions for working with
> > the builtin windows security such as looking up names and sids, checking
> > whether the current process access token contains a given sid/name,
> getting
> > the current process access token username, computername, FQDN, and a few
> > others.
> >
> > ---
> > The fact that there's a Highway to Hell but only a Stairway to Heaven
> says
> > a lot about anticipated traffic volume.
> >
> >
> > >-----Original Message-----
> > >From: sqlite-users [mailto:sqlite-users-
> > >[email protected]] On Behalf Of Simon Slavin
> > >Sent: Thursday, 30 November, 2017 07:16
> > >To: SQLite mailing list
> > >Subject: [sqlite] Article on AUTOINC vs. UUIDs
> > >
> > >Thought some of you might enjoy seeing this article.  I make no
> > >comment on what I think of the reasoning therein.  It’s set in the
> > >PostgreSQL world, but you could make an external function for SQLite
> > >which generates UUIDs.
> > >
> > ><https://www.clever-cloud.com/blog/engineering/2015/05/20/why-auto-
> > >increment-is-a-terrible-idea/>
> > >
> > >"Today, I'll talk about why we stopped using serial integers for our
> > >primary keys, and why we're now extensively using Universally Unique
> > >IDs (or UUIDs) almost everywhere."
> > >
> > >Simon.
> > >_______________________________________________
> > >sqlite-users mailing list
> > >[email protected]
> > >http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> >
> >
> > _______________________________________________
> > sqlite-users mailing list
> > [email protected]
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Article on AUTOINC vs. UUIDs

Reply via email to