Based on my experience as a DBA, I suspect there are penalties, but you are 
only likely to see them under load.

All data returned from databases passes through RAM.  All databases enhance 
performance via caching data in RAM.  The longer a row (i.e the more bytes in a 
row), the less rows that can be cached in a given amount of RAM.  Making 
primary keys that require 32 bytes of storage per row versus, say integers 
which require 4 bytes, reduces the amount of data that can be served from cache.

Now multiply the 32 vs 4 bytes by:  a) every foreign key referencing back to 
primary key; b) every non-clustered index on the primary key; and c) every 
index on the foreign keys.  

Add it all up and there likely is detrimental impact on:  a) performance; b) 
scalability; and c) cost (for both disks and RAM).

I think this is way 9 out of 10 DBAs prefer the shortest, most compact datatype 
possible for primary keys ;-)

Yes, there are cases where uuids are nice (e.g. Elias and Matt's examples of 
"merging content from different servers" and "migrate a blog").  But having the 
larger datatype introduces a penalty that you pay 100% of the time, vs these 
less frequent cases when having uuids help out.

Just my 2ยข. 

g

P.S.  a quick google turned up the following link of a MSSQL DBA's thoughts on 
GUIDs as primary keys:
http://www.sqljunkies.com/WebLog/odds_and_ends/archive/2005/08/31/16595.aspx


----- Original Message ----- 
From: "Elias Torres" <[EMAIL PROTECTED]>
To: <roller-dev@incubator.apache.org>
Sent: Tuesday, March 14, 2006 8:46 AM
Subject: Re: why primary key is using hex instead of integer


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I don't think it imposes a performance hit and even if it did it's
neglible. Once the database has indexed the strings or the ints the
lookup cost should be the same. I think it might help if we used fixed
char(32) as opposed to varchar(40), but that's mostly a space issue
since we are not using the last 8 bytes.

- -Elias

Allen Gilliland wrote:
> I don't know about why this decision was made historically, but i would
> agree that technically this imposes a slight performance hit.  AFAIK,
> databases can do lookups on purely numeric keys a little bit faster than
> alphanumerics.
> 
> I have no idea if the difference is significant or not.
> 
> -- Allen
> 
> 
> On Tue, 2006-03-14 at 07:25, Lance Lavandowska wrote:
> 
>>IIRC it is largely historical: at one point Roller used Castor, which
>>has as it's "default" a GUID generator that creates a 30-char primary
>>key.
>>
>>Lance
>>
>>On 3/14/06, David M Johnson <[EMAIL PROTECTED]> wrote:
>>
>>>On Mar 11, 2006, at 4:45 PM, BigLiu wrote:
>>>
>>>>I want to create some customer tables to extend roller. But I have
>>>>question
>>>>regarding why the primary key is defined in hex instead of integer?
>>>>Will
>>>>this cause any performance problem?
>>>
>>>I can't really remember why I chose the hex key. I don't think it has
>>>a significant impact on performance.
>>>
>>>- Dave

Reply via email to