Excuse me for jumping in but isn't a primary key likely to be a poor choice for a clustering index? Wouldn't it be better to choose a non-unique key of some kind?
I've used clustering indexes for years with DB2 and, assuming InnoDB uses the term in the same way, a clustering index has only one effect over and above the effect of any other index: it influences the placement of new rows when they are inserted in such a way that the database manager *tries* to put a new row on the same physical page as another row with the same key value. This is potentially advantageous because it improves performance in the case where a user asks for the rows in the clustering sequence, assuming that a large majority of rows *were* actually stored in clustering sequence. To illustrate with an example, consider an ORDERS table which contains orders placed with a manufacturing company. The table is likely to have a primary key and, more often than not, the primary key will be ORDER_NUMBER. The table is also likely to have a column like ORDER_DATE, which will normally be non-unique. If I cluster the table on ORDER_NUMBER, the clustering index is only going to be useful if I write a query that displays all my orders in order number sequence. Although that is certainly something we do often enough when messing around with test tables, just to be sure the data is inserted correctly, that doesn't seem particularly likely in the real world for real reports. A report we're much more likely to do in the real world, with real production data, is to list the orders for a given day or week. In that case, that query would work very well if the table had the ORDER_DATE as the clustering index, assuming that most of the rows are actually in clustering sequence. Again, in the DB2 world, we have to make sure that we leave sufficient freespace so that new rows *can* go on the same page as rows with the same or adjacent keys and we have to reorganize our tables occasionally to get all the rows back into clustering order and re-establish the necessary margin of freespace. Does InnoDB clustering work the same way? I'm afraid I haven't had any need for clustering indexes in my project so far so I haven't had to look for the information. Rhino ----- Original Message ----- From: "Heikki Tuuri" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Saturday, June 19, 2004 6:12 AM Subject: Re: Clustered Index - Where is data inserted ? Not Answered > Paul, > > ----- Original Message ----- > From: ""Paul Chu"" <[EMAIL PROTECTED]> > Newsgroups: mailing.database.myodbc > Sent: Saturday, June 19, 2004 5:07 AM > Subject: RE: Clustered Index - Where is data inserted ? Not Answered > > > > Appreciate any help at all > > > > Thanks, Paul > > > > > > -----Original Message----- > > From: Paul Chu [mailto:[EMAIL PROTECTED] > > Sent: Friday, June 18, 2004 10:16 AM > > To: [EMAIL PROTECTED] > > Subject: Clustered Index - Where is data inserted ? > > > > Hi, > > > > Can someone explain how the data is stored in a table using a clustered > > index. > > > > Does this mean that the data is inserted in the .myd file in sorted > > index order ? > > MyISAM does not have clustered indexes. InnoDB has a clustered index on > every table, and normally it is on the PRIMARY KEY. > > > If so, how is space made to insert the new records ? > > Does this make inserting records slow because data is being inserted in > > physical sort order in the table ? > > For large tables, people usually insert new rows at the high end of the > PRIMARY KEY, since the key is usually a generated id. Splitting B-tree pages > at the index end is very efficient. In this case having a clustered index > poses no overhead. > > But if you insert records randomly on the PRIMARY KEY, then B-tree page > splits waste disk space, and that will reduce performance somewhat. > > > Thanks, Paul > > Best regards, > > Heikki Tuuri > Innobase Oy > Foreign keys, transactions, and row level locking for MySQL > InnoDB Hot Backup - a hot backup tool for InnoDB which also backs up MyISAM > tables > http://www.innodb.com/order.php > > Order MySQL technical support from https://order.mysql.com/ > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] > > -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]