Hello everyone, I am currently working on a data project that uses PostgreSQL extensively to store, manage and maintain the data. We haven't had any problems regarding database size until recently. The three major tables we use have never grown beyond 10 million records. At this size, we can do things like keeping the indexes or even the tables in memory to allow faster access.
Recently, we have found customers who want to use our service with data files of between 100 million and 300 million records. At that size, each of the three major tables will hold between 150 million and 700 million records. I can't expect queries to run in 10-15 seconds (which is what we get with 10 million records), but I would prefer to keep them all under a minute.

We did some initial testing on a server with 8GB of RAM and found that we can do operations on data files of up to 50 million records fairly well, but performance drops dramatically after that. Does anyone have any suggestions on a good way to improve performance for these extra-large tables? Things that have come to mind are replication and Beowulf clusters, but from what I have recently studied, these don't do so well with single processes. We will have parallel processes running, but it's more important that each individual process be fast than that several processes run in parallel at once.

Any help would be greatly appreciated!

Thanks,
Joshua Marsh

P.S. Off-topic, I have a few invitations to Gmail. If anyone would like one, let me know.