On Fri, Jun 25, 2010 at 7:39 AM, P Kishor <punk.k...@gmail.com> wrote:
> Thanks Cory.
>
> On Fri, Jun 25, 2010 at 9:07 AM, Cory Nelson <phro...@gmail.com> wrote:
>> On Fri, Jun 25, 2010 at 6:49 AM, P Kishor <punk.k...@gmail.com> wrote:
>>> Is there any gotcha, any disadvantage (query complexity, db size,
>>> query speed) to using a composite PK (two columns) vs. a single
>>> AUTOINCREMENT INT?
>>>
>>> Background: I happen to have the two columns in question in my table
>>> anyway. Adding an INTEGER PRIMARY KEY would use up space I don't want
>>> to use. My db is big enough to worry about space from a single field.
>>
>> Primary keys that are not a single integer are equivalent in storage
>> and complexity to a separate additional index, and composite keys
>> don't have any penalty over single-column keys.
>>
>> Single integer primary keys you get for free.  They take no extra
>> storage or complexity because in SQLite every table already has one
>> even if you don't use it (called a "rowid").  Specifying one merely
>> gives it a new name.
>>
>
>
> My question was from a more generic db perspective although I asked it
> here on sqlite list. Specifically, in Pg, there is no concept of
> 'rowid'. Even though there is an 'oid', it can be turned off, and is
> anyway recommended to not be depended upon to act as a PK. Even more
> so, it ('oid' in Pg), is an INT4, so it is upper-limit bound. So, let
> me rephrase my question a bit --

A lot of things vary between databases, and some even support multiple
storage backends (b+tree, hash table, etc.) for various things.  I
don't think you'll be able to find any answers that work across
multiple databases beyond "yes that SQL will work with a conformant
implementation".

> Are composite PKs made up of two INT columns the same in query speed,
> complexity and storage as a single INT PK that is not an alias of a
> system row id?

Short answer:

As SQLite has shown, it depends on the database.  You'll probably
either find databases to have no overhead, or the equivalent overhead
of an index.

> What I glean from your answer --
>
> PKs that are not a single INT, but a composite of, say, two INT
> columns, "add the equivalent of an additional index." Is that what you
> are saying? An index in addition to what might be created to treat
> them as a PK?
>
> And, "composite keys don't have any penalty over single-column keys."
> In other words, a composite PK made up of two INT columns is just as
> quick and svelte as a single INT column PK. Is that what you are
> saying?
>

Longer answer:

SQLite uses a b+tree for storage.  Tables are all b+trees keyed by an
integer (the "rowid").  Tables store all your data.  Indexes create a
separate b+tree keyed by the indexed columns.  Indexes store only the
indexed columns and a rowid to the table row it pertains to.  So when
you find something in the index, it needs to read the rowid from the
index and then use that to find the row in the table.  That's the
extra overhead (in terms of space and I/Os) that an index causes.  As
far as indexes are concerned, single-column is not any more or less
efficient than multi-column.

Because SQLite always uses an integer as the key for tables, it maps
well to an integer primary key.  Anything else and it needs to create
one of those separate b+trees, so any non-integer primary key will
have additional overhead.  Other databases might avoid creating a
separate b+tree, or they might not.  There's is nothing in the SQL
standards that dictate how they implement this, and there's no general
rule to follow -- you'll just have to look in their documentation to
find out.

-- 
Cory Nelson
http://int64.org
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to