I have a table which currently has about 500 million rows.  For the most part,
the situation is that I will import a few hundred million more rows from text
files once every few months, but otherwise there won't be any insert, update
or delete queries.  I have created five indexes, some of them multi-column,
which make a tremendous difference in performance for the statistical queries
I need to run frequently (seconds versus hours).  When adding data to the
table, however, I have found that it is much faster to drop all the indexes,
copy the data into the table and then create the indexes again (hours versus
days).  So my question is whether this is really the best way: should I write
a script which drops all the indexes, copies the data and then recreates the
indexes, or is there a better way to do this?
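Roughly, the script I have in mind would look like the following (the table,
index and file names here are just placeholders for illustration):

    -- Drop the indexes before the bulk load (all five in reality,
    -- some of them multi-column):
    DROP INDEX IF EXISTS measurements_sample_name_idx;
    DROP INDEX IF EXISTS measurements_sample_name_date_idx;

    -- Bulk-load the new rows from the text file:
    COPY measurements FROM '/path/to/new_batch.txt';

    -- Recreate the indexes (each one scans and sorts the whole table):
    CREATE INDEX measurements_sample_name_idx
        ON measurements (sample_name);
    CREATE INDEX measurements_sample_name_date_idx
        ON measurements (sample_name, run_date);

    -- Refresh planner statistics after the load:
    ANALYZE measurements;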

There are also rare cases where I might want to make a correction.  For
example, one of the columns is the sample name, which is a foreign key to a
samples table defined with "ON UPDATE CASCADE."  I decided to change a sample
name in the samples table, which should affect about 20 million rows out of
the previously mentioned 500 million.  That query has now been running for
five days and isn't finished yet.
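For concreteness, the setup and the correction look roughly like this (table,
column and sample names are simplified placeholders):

    CREATE TABLE samples (
        name text PRIMARY KEY
    );

    CREATE TABLE measurements (
        sample_name text REFERENCES samples (name) ON UPDATE CASCADE,
        value       double precision
    );

    -- The correction: renaming one sample cascades to every matching
    -- row in the big table (about 20 million of the 500 million).
    UPDATE samples SET name = 'corrected-name' WHERE name = 'old-name';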

-Aram
