Excuse me for jumping in but isn't a primary key likely to be a poor choice
for a clustering index? Wouldn't it be better to choose a non-unique key of
some kind?

I've used clustering indexes for years with DB2 and, assuming InnoDB uses
the term in the same way, a clustering index has only one effect over and
above the effect of any other index: it influences the placement of new rows
when they are inserted in such a way that the database manager *tries* to
put a new row on the same physical page as another row with the same key
value. This is potentially advantageous because it improves performance in
the case where a user asks for the rows in the clustering sequence, assuming
that a large majority of rows *were* actually stored in clustering sequence.

To illustrate with an example, consider an ORDERS table which contains
orders placed with a manufacturing company. The table is likely to have a
primary key and, more often than not, the primary key will be ORDER_NUMBER.
The table is also likely to have a column like ORDER_DATE, which will
normally be non-unique. If I cluster the table on ORDER_NUMBER, the
clustering index is only going to be useful if I write a query that displays
all my orders in order number sequence. Although that is certainly something
we do often enough when messing around with test tables, just to be sure the
data is inserted correctly, that doesn't seem particularly likely in the
real world for real reports. A report we're much more likely to do in the
real world, with real production data, is to list the orders for a given day
or week. In that case, that query would work very well if the table had the
ORDER_DATE as the clustering index, assuming that most of the rows are
actually in clustering sequence.

Again, in the DB2 world, we have to make sure that we leave sufficient
freespace so that new rows *can* go on the same page as rows with the same
or adjacent keys and we have to reorganize our tables occasionally to get
all the rows back into clustering order and re-establish the necessary
margin of freespace.

Does InnoDB clustering work the same way? I'm afraid I haven't had any need
for clustering indexes in my project so far so I haven't had to look for the
information.

Rhino

----- Original Message ----- 
From: "Heikki Tuuri" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Saturday, June 19, 2004 6:12 AM
Subject: Re: Clustered Index - Where is data inserted ? Not Answered


> Paul,
>
> ----- Original Message ----- 
> From: ""Paul Chu"" <[EMAIL PROTECTED]>
> Newsgroups: mailing.database.myodbc
> Sent: Saturday, June 19, 2004 5:07 AM
> Subject: RE: Clustered Index - Where is data inserted ? Not Answered
>
>
> > Appreciate any help at all
> >
> > Thanks, Paul
> >
> >
> > -----Original Message-----
> > From: Paul Chu [mailto:[EMAIL PROTECTED]
> > Sent: Friday, June 18, 2004 10:16 AM
> > To: [EMAIL PROTECTED]
> > Subject: Clustered Index - Where is data inserted ?
> >
> > Hi,
> >
> > Can someone explain how the data is stored in a table using a clustered
> > index.
> >
> > Does this mean that the data is inserted in the .myd file in sorted
> > index order ?
>
> MyISAM does not have clustered indexes. InnoDB has a clustered index on
> every table, and normally it is on the PRIMARY KEY.
>
> > If so, how is space made to insert the new records ?
> > Does this make inserting records slow because data is being inserted in
> > physical sort order in the table ?
>
> For large tables, people usually insert new rows at the high end of the
> PRIMARY KEY, since the key is usually a generated id. Splitting B-tree
pages
> at the index end is very efficient. In this case having a clustered index
> poses no overhead.
>
> But if you insert records randomly on the PRIMARY KEY, then B-tree page
> splits waste disk space, and that will reduce performance somewhat.
>
> > Thanks, Paul
>
> Best regards,
>
> Heikki Tuuri
> Innobase Oy
> Foreign keys, transactions, and row level locking for MySQL
> InnoDB Hot Backup - a hot backup tool for InnoDB which also backs up
MyISAM
> tables
> http://www.innodb.com/order.php
>
> Order MySQL technical support from https://order.mysql.com/
>
>
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to