Re: [sqlite] SQLITE bug

R Smith Sun, 03 Sep 2017 11:04:18 -0700


On 2017/09/03 4:16 PM, Joseph L. Casale wrote:

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On
Behalf Of R Smith
Sent: Sunday, September 3, 2017 7:51 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] SQLITE bug

Lastly, a comment I've made possibly more than once on this list: There
is no imperative to trust the SQL engine with ID assignments. You are
free to (and I prefer to) assign IDs yourself.

What exactly do you feel you benefit by taking ownership of the ID, specifically
that of which you feel supersedes the obvious perils in the cases you noted?


Several things; I'll restrict this to the top three because TLDR.

First and foremost, precise control. I can decide on a per-system basis which IDs will be available. Now the use-case in SQLite terms is obvious since every SQLite database is a unique system unto itself (no central server) and I can control which IDs (or ID blocks) will be used. The use case for data restoration or synchronization between keyed DBs is obvious too, so I'll concentrate on the less obvious things. For one, I can rest assured no tampering will break the systems when the PKs are controlled (like some naughty software/person/bug adjusting the sqlite_sequence table or just adding a row that has an ID near the 64-bit limit).

For a more non-SQLite reason, I can control distributed systems by specifying which ID-blocks are assigned by which system as a way to control system-group-wide unique key IDs without resorting to GUIDs and the like[1]. I can also check for non-DB based corruption (the DB mechanism is fine but the data isn't working like it should because of bugs, tampering or other faults) based on violation of the predefined Key assignment ranges - at least as one of the checks.
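As a rough illustration of ID blocks and the range check, here is a small Python sketch. The block size, the numbering of systems from 0 and all function names are assumptions made up for the example, not a description of any real system:

```python
BLOCK_SIZE = 1_000_000  # hypothetical: each system owns one block of this many IDs

def block_for(system_no: int) -> range:
    """Return the ID range owned by a system (systems numbered from 0)."""
    start = system_no * BLOCK_SIZE + 1
    return range(start, start + BLOCK_SIZE)

def owner_of(row_id: int) -> int:
    """Return the number of the system whose block contains row_id."""
    return (row_id - 1) // BLOCK_SIZE

def out_of_range_ids(ids, system_no: int) -> list:
    """List the IDs that the given system should never have produced."""
    block = block_for(system_no)
    return [i for i in ids if i not in block]

# System 2 owns IDs 2_000_001 .. 3_000_000; the ID 17 does not belong here.
local_ids = [2_000_001, 2_000_002, 17, 2_999_999]
suspect = out_of_range_ids(local_ids, system_no=2)
```

Any ID that turns up in suspect points at a bug, tampering or a bad merge rather than at the DB engine, which is exactly the kind of check that is not possible when the engine hands out the keys.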

The next item pertains to all SQL DBs. Lower linking complexity - this is the most important, but a bit hard to explain. If you have multiple linked tables (as one often does) then I can control the linked IDs between them. A small example from a typical use case we have: We store Addresses in a table (a single company might have more than one site and address), and to every address is a linked row in another table with GPS information (Lat, Long, google maps api links, etc.) and yet another table holds cached maps image blobs and the like - so sometimes an ID is used for an address which does not have any cached maps, but the next address ID added will add the same ID for the GPS table and the Maps table (often skipping a PK ID or two to achieve this). Also, later I can add cached maps for any skipped item by simply inserting with its same ID.

The first thing to note about the above is that I can simply say in pseudo code:

  i = calc_new_id();
  INSERT INTO Addresses(ID, V1, V2, ...) VALUES (:i, :P1, :P2, ....);
  INSERT INTO Maps(ID, V1, V2, ...) VALUES (:i, :P1, :P2, ....);
  INSERT INTO GPS(ID, V1, V2, ...) VALUES (:i, :P1, :P2, ....);

whereas an AUTOINCREMENT-based conventional approach might end up something like:

  INSERT INTO Addresses(ID, V1, V2, ...) VALUES (NULL, :P1, :P2, ....);
  i = getLastInsertID();
  INSERT INTO Maps(ID, KeyToAddress, V1, V2, ...) VALUES (NULL, :i, :P1, :P2, ....);
  m = getLastInsertID();
  INSERT INTO GPS(ID, KeyToAddress, KeyToMap, V1, V2, ...) VALUES (NULL, :i, :m, :P1, :P2, ....);


with some possible variation depending on your needs.

Debugging systems like the first example above is much easier in human terms since, when manually cross-checking, I don't have to remember 3 different IDs... The Address at ID 177 has Map data at ID 177 in the Maps table and GPS data at ID 177 in the GPS table... you see the pattern easily. It also obviates the need for an additional FK column in the subsequent tables to "map" to the parent address, since the PK in itself /IS/ the FK to the parent. (I might still add FKs to gain cascading functionality or where the relation is one-to-many).
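For anyone who wants to try the shared-ID layout, here is a runnable Python sketch with the standard sqlite3 module. The column names, street names and coordinates are invented for the example, add_address is a hypothetical helper, and the REFERENCES clauses are optional (SQLite only enforces them when PRAGMA foreign_keys is switched on):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Addresses (ID INTEGER PRIMARY KEY, Street TEXT);
    -- The PK of each child table doubles as the FK to Addresses.
    CREATE TABLE GPS  (ID INTEGER PRIMARY KEY REFERENCES Addresses(ID), Lat REAL, Lon REAL);
    CREATE TABLE Maps (ID INTEGER PRIMARY KEY REFERENCES Addresses(ID), Image BLOB);
""")

def add_address(new_id, street, lat, lon, image=None):
    """Insert one address and its linked rows under a single caller-chosen ID."""
    with con:  # one transaction: all rows are stored, or none
        con.execute("INSERT INTO Addresses (ID, Street) VALUES (?, ?)", (new_id, street))
        con.execute("INSERT INTO GPS (ID, Lat, Lon) VALUES (?, ?, ?)", (new_id, lat, lon))
        if image is not None:
            con.execute("INSERT INTO Maps (ID, Image) VALUES (?, ?)", (new_id, image))

add_address(176, "1 Main Rd", -33.92, 18.42)             # no cached map yet
add_address(177, "2 Side St", -26.2, 28.04, b"\x89PNG")  # map stored right away

# A cached map for a skipped item can be added later under the same ID.
with con:
    con.execute("INSERT INTO Maps (ID, Image) VALUES (?, ?)", (176, b"\x89PNG"))

# Cross-checking needs only the one ID; no extra key columns are involved.
row_177 = con.execute(
    "SELECT a.Street, g.Lat, m.Image FROM Addresses a "
    "JOIN GPS g ON g.ID = a.ID LEFT JOIN Maps m ON m.ID = a.ID WHERE a.ID = ?",
    (177,),
).fetchone()
map_count = con.execute("SELECT COUNT(*) FROM Maps").fetchone()[0]
```

Note that no call to fetch a last-insert ID appears anywhere: the ID is known before the first INSERT runs.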

You control EVERY other piece of data you push into the DB, why not the Key too? Or put differently, why would you rather do in principle:

  INSERT Stuff;
  Get the Key for it;
  Use the key to insert more stuff or use elsewhere if needed; [2]

Than:
  Make the Key;
  Use the key to insert stuff or use elsewhere if needed;

I know the "Make the Key" step might be a little bit of effort for quick little DBs, so I too use the auto-increment for them, but for everything substantial, the amount of coding to do this pales in comparison with the amount of code you write to do normal system checking and testing, AND, using your own specified keys can often save some code on the checking side.

[Let me admit here that in many non-intrinsic applications my "calc_new_id();" function often resolves to simply "SELECT MAX(ID)+1 FROM..." and a validation check, often no more than a 2-line function, but in other systems it may get as complex as contacting a central server to gain/verify the next pool of insert IDs for the local system to use.]
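The simple variant might look roughly like this in Python with the standard sqlite3 module. This is a sketch only: the upper bound of 1,000,000 in the validation check is an arbitrary assumption, and it presumes a single writer (with concurrent writers the SELECT and the INSERT that uses its result would have to sit in the same write transaction):

```python
import sqlite3

def calc_new_id(con, table="Addresses"):
    """Return MAX(ID)+1 for the table, or 1 if it is empty (hypothetical helper)."""
    # The table name is interpolated, so it must come from trusted code only.
    (new_id,) = con.execute(f"SELECT COALESCE(MAX(ID), 0) + 1 FROM {table}").fetchone()
    # Validation check: stay inside the range this system is allowed to use.
    if not 1 <= new_id <= 1_000_000:
        raise ValueError(f"ID {new_id} is outside the allowed range")
    return new_id

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Addresses (ID INTEGER PRIMARY KEY, V1 TEXT)")

first_id = calc_new_id(con)  # empty table
con.execute("INSERT INTO Addresses (ID, V1) VALUES (?, ?)", (first_id, "a"))
con.execute("INSERT INTO Addresses (ID, V1) VALUES (?, ?)", (41, "b"))
next_id = calc_new_id(con)   # one past the current maximum
```

The COALESCE is there because MAX(ID) is NULL on an empty table, which a bare MAX(ID)+1 would pass straight through.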

Lastly - One small gripe I have with auto-increment is that it lulls system programmers into wanting to make INT PK's for everything, even stuff where that is clearly the wrong/unneeded approach. An Order-entry system should have the full Order-Number as the PK, not an INT. I often see people making a table like this and just out of habit throwing an INTEGER PRIMARY KEY AUTOINCREMENT in there with the next column being the: OrderNo TEXT COLLATE NOCASE UNIQUE - I ask you: if it walks like a PK, and talks like a PK... isn't it the real PK?
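A minimal sketch of the order-number-as-PK idea, again in Python with the standard sqlite3 module. The Orders table, the Customer column and the order numbers are invented for the example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# The order number itself is the key; there is no surrogate integer column.
con.execute("""
    CREATE TABLE Orders (
        OrderNo  TEXT COLLATE NOCASE PRIMARY KEY NOT NULL,
        Customer TEXT
    )
""")
con.execute("INSERT INTO Orders VALUES (?, ?)", ("ORD-2017-0001", "John"))

# NOCASE makes the key case-insensitive, so this counts as a duplicate.
duplicate_rejected = False
try:
    con.execute("INSERT INTO Orders VALUES (?, ?)", ("ord-2017-0001", "Jane"))
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Lookups by order number also ignore case, using the column's collation.
customer = con.execute(
    "SELECT Customer FROM Orders WHERE OrderNo = ?", ("ord-2017-0001",)
).fetchone()[0]
```

The explicit NOT NULL is deliberate: for historical reasons SQLite accepts NULL in a non-INTEGER primary key column unless told otherwise.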

There is nothing inherently relational about integers, it's a computer-sourced convention used more for its ordinal properties than relational properties. Nobody talks about person 44's children... It's John's children. Of course it has merit on the point of storing the referring INT in many other tables being more efficient than including the entire TEXT value, and in some systems the lookup is faster (in MSSQL definitely, but I have not tested it in SQLite). This is however an ever-dwindling advantage.


Oh yes, and another reason: My OCD.  :)

Now just to be clear: I am not advocating the banning of AUTOINCREMENT, just trying to point out that it's a hammer graciously provided by DB engines, but no rule requires you to use it, and some jobs are not nails - especially if you have an expectation of what the ID should be (like the OP did).



Cheers,
Ryan

[1] I do actually use and suggest to use correctly calculated GUIDs for system-wide Unique references which avoids ID blocks or pools, but, a GUID is the most human-unfriendly ID possible and takes more space than an INT, so I only resort to it when the scale of the system justifies, but that's another debate and you can find discussion threads on it in this very list.

[2] If you do not care at all what the Key is (non-linked tables where you won't use the Key for anything else), then Auto-Increment is just dandy, of course.



_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
