Re: [sqlite] BLOBs and NULLs

Drago, William @ MWG - NARDAEAST Tue, 22 Apr 2014 08:35:34 -0700

>I myself prefer create table foo (guid blob primary key [NOT NULL], ...).

If a genuine GUID looks like this: 37af1247-2e77-4880-8f46-48803ae2cd0a, then 
why blob and not text?

-Bill

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Dominique Devienne
Sent: Tuesday, April 22, 2014 5:07 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] BLOBs and NULLs

On Mon, Apr 21, 2014 at 4:58 PM, James K. Lowden <jklow...@schemamania.org> 
wrote:
> On Mon, 21 Apr 2014 13:30:15 +0000
> "Drago, William @ MWG - NARDAEAST" <william.dr...@l-3com.com> wrote:
>
>> Should I split this table up into smaller tables to eliminate the
>> NULLs (e.g. use one table each for IL, Phase, RL, Isolation)?

> Your database design is a model of the real world.  The rules it
> enforces should reflect those of the world it models. The tuple (IL,
> Phase, RL, Isolation) doesn't have much meaning, does it, in the sense
> that *together* they say something special about whatever (GUID, Path)
> represent?  From your description, each individual tuple (e.g., (GUID,
> Path, IL)) is meaningful, but the presence of, say, IL without Phase
> is not.  Each is a separate, freestanding fact, justifying its own table.

Pushed to its extreme, this sounds like an EAV* model, which seems surprising 
coming from you James ;)

* http://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model

> The part I like least about your design (1) the use of a GUID and, in
> particular (2) the name "GUID" for the column.  If you are generating
> this GUID, don't; use an integer.  If you're not generating it -- if
> it comes to you from another source and therefore identifies something
> in "the real world" in some sense, OK.  Either way, use the name of
> the column to reflect the thing identified, not the datatype of the
> identifier.

First, about your GUID versus integer point (an often contentious debate), I'm 
on the (opposite) GUID side myself. And the reason is to increase concurrency 
and decrease contention. When you depend on a server-enforced AUTO-INCREMENTED 
key, you must go to the central server to get that unique id, which is a 
round-trip (in client-server scenarii), and sometimes you must also insert a 
row, before you potentially have all the facts (columns). A GUID on the other 
hand can be randomly generated client-side, completely independently of the 
server, and still be unique (since 128 bit, a very very large integer, and a 
good RNG). If you need to "merge" later tables or databases created 
independently, you won't have GUID conflicts, unlike AUTO-INCREMENTED integer 
PKs (the integers are used in FKs elsewhere, so any merge operation must map 
the old AUTO-INC PKs to the new AUTO-INC ones, and updates all FKs). So IMHO, a 
GUID PK is very appropropriate in many scenarii, as long as one remembers it i
 s only a Surrogate Key, i.e. a convenient fixed-sized, globally unique (and 
thus often immutable) single value to use in FKs (especially convenient in the 
face of complex composite natural keys), and not a Natural Key.

Second, the datatype is blob (or RAW(16) in Oracle), not GUID, and there's thus 
nothing wrong to call it what it is. Many people in the SQL community seem to 
like stuttering, and will call it foo_id or foo_uid or foo_guid for a foo 
table, but I myself prefer create table foo (guid blob primary key [NOT NULL], 
...).

It's mostly a matter of opinion and taste, but there are benefits to using 
GUIDs as SKs in some circumstances in my book. My $0.02. --DD

PS: For a long time, SQLite depended on that integer PK, whether one wanted it 
or not (and I wasn't aware saying PRIMARY KEY for a non-integer PK did not mean 
it's standard-required NOT NULL until reading a recent thread. I wish there was 
a single opt-in pragma to disable all non-standard behavior). Now there's 
WITOUT ROWID tables, but from reading this list, it seems like it forces you to 
a B-tree, instead of a B*-tree  (store rows in leafs only) like ROWID-based 
tables, which I believe means more expensive inserts via more B-tree 
rebalancing, which is especially bad with large (blob-using) rows, given SQLite 
lack of out-of-row blob storage.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any 
attachments are solely for the use of the addressee and may contain information 
that is privileged or confidential. Any disclosure, use or distribution of the 
information contained herein is prohibited. In the event this e-mail contains 
technical data within the definition of the International Traffic in Arms 
Regulations or Export Administration Regulations, it is subject to the export 
control laws of the U.S.Government. The recipient should check this e-mail and 
any attachments for the presence of viruses as L-3 does not accept any 
liability associated with the transmission of this e-mail. If you have received 
this communication in error, please notify the sender by reply e-mail and 
immediately delete this message and any attachments.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] BLOBs and NULLs

Reply via email to