Just a note, I was trying the cluster command and was short on space. I figured I had enough space for the new table and index. It failed on me twice. The reason is that I noticed for the command to complete, it needed the space of the new table and 2x the space of the new index. It looks like it creates the new table, then a new index. Afterwards it looked like it creates another index in the DB pgsql_tmp. So for me this is an important consideration, since the new index size was about 7GB. I had not anticipated the second index size so that is why it failed. I ended up creating a link of pgsql_tmp to another parttion to successfully complete.
Dan. -----Original Message----- From: Bill Moran [mailto:[EMAIL PROTECTED] Sent: Thursday, April 15, 2004 4:14 PM To: Shea,Dan [CIS] Cc: Postgres Performance Subject: Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already Shea,Dan [CIS] wrote: > Bill, if you had alot of updates and deletions and wanted to optimize your > table, can you just issue the cluster command. > Will the cluster command rewrite the table without the obsolete data that a > vacuum flags or do you need to issue a vacuum first? From the reference docs: "During the cluster operation, a temporary copy of the table is created that contains the table data in the index order. Temporary copies of each index on the table are created as well. Therefore, you need free space on disk at least equal to the sum of the table size and the index sizes. "CLUSTER preserves GRANT, inheritance, index, foreign key, and other ancillary information about the table. "Because the optimizer records statistics about the ordering of tables, it is advisable to run ANALYZE on the newly clustered table. Otherwise, the optimizer may make poor choices of query plans." The primary reason CLUSTER exists is to allow you to physically reorder a table based on a key. This should provide a performance improvement if data with the same key is accessed all at once. (i.e. if you do "SELECT * FROM table WHERE key=5" and it returns 100 rows, those 100 rows are guaranteed to be all on the same part of the disk after CLUSTER, thus a performance improvement should result.) Updates and inserts will add data in the next available space in a table with no regard for any keys, and _may_ require running all over the disk to retrieve the data in the previous example query. I doubt if CLUSTER is an end-all optimization tool. The specific reason I suggested it was because the original poster was asking for an easier way to drop/recreate a table (as prior experimentation had shown this to improve performance) I can't think of anything easier than "CLUSTER <tablename> ON <keyname>" Since CLUSTER recreates the table, it implicitly removes the dead tuples. However, it's going to be a LOT slower than vacuum, so if dead tuples are the main problem, vacuum is still the way to go. -- Bill Moran Potential Technologies http://www.potentialtech.com ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])