On Dec 4, 2012, at 4:46 PM, Wolfgang Keller wrote:

>>>> Can I use the Guid as primary key? I am newbie to sql and mysql
>>>> management. 
>>> 
>>> Using such auto-generated surrogate keys is always a really bad idea
>>> and the straightest and shortest way to data inconsistency hell
>>> (especially through duplicates).
>> 
>> you've seen two guids generate as duplicates ?
> 
> You missed my point - *exactly*. ;-)
> 
> The very point of "natural" (as opposed to "surrogate") keys is that any
> real world "thing" (such as e.g. a person) represented by a database
> entity must have exactly one *and only one* matching record in the
> corresponding database table.
> 
> Auto-generated "GUID"s are exactly what does *not* prevent the
> generation of several records for one single real-world "thing". By
> simple multiple manual entry of the same data several times due to
> operator error. Identifying a "thing" by a natural identifier, i.e. one
> that is actually unique for each and every "thing" *in the real-world*
> does prevent such inconsistency.
> 
> I once came across an article describing one of the probably worst
> cases of database havoc caused by surrogate keys in computer history -
> the EMR system of the US veterans health administration. According to
> this article, they face the problem today that each and every individual
> veteran has an arbitrary, unknown number of corresponding database
> records - and there's no way to reconcile this mess due to the sheer
> volume and various technical issues such as typos etc.

oh, well this is the "natural keys are better" argument.  Unfortunately, 
surrogate integer PKs are necessary in most real-world installations as they 
provide far better performance than a typical natural key, when you consider 
that they are also copied out to all the referencing FK columns and are present 
in many indexes.   Even though SQLA totally supports natural keys as well as it 
can, I don't generally use natural PKs in my real-world contracts.  The DBAs I 
work with won't allow them.

I'd say any system that actually has any human being *manually entering* a 
surrogate PK value as part of the application's user interface is intrinsically 
broken.   Especially a GUID value. 

I will often add a UNIQUE constraint to the "natural" PK of the table, next to 
the surrogate PK, to avoid the data duplication issues you refer to.    I've 
used surrogate PKs for decades without data duplication issues.

That said, I did have a bad experience with GUIDs, which were in fact "natural" 
guids that were generated deterministically from other elements of the data - 
which was that the performance suffered terribly, specifically because of all 
that heavy GUID data copied out to all the referencing FKs and the associated 
indexes.




-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Reply via email to