On Thu, 2004-10-21 at 21:14, Joshua Marsh wrote: > Hello everyone, > > I am currently working on a data project that uses PostgreSQL > extensively to store, manage and maintain the data. We haven't had > any problems regarding database size until recently. The three major > tables we use never get bigger than 10 million records. With this > size, we can do things like storing the indexes or even the tables in > memory to allow faster access. > > Recently, we have found customers who are wanting to use our service > with data files between 100 million and 300 million records. At that > size, each of the three major tables will hold between 150 million and > 700 million records. At this size, I can't expect it to run queries > in 10-15 seconds (what we can do with 10 million records), but would > prefer to keep them all under a minute. > > We did some original testing and with a server with 8GB or RAM and > found we can do operations on data file up to 50 million fairly well, > but performance drop dramatically after that. Does anyone have any > suggestions on a good way to improve performance for these extra large > tables? Things that have come to mind are Replication and Beowulf > clusters, but from what I have recently studied, these don't do so wel > with singular processes. We will have parallel process running, but > it's more important that the speed of each process be faster than > several parallel processes at once.
I'd assume that what's happening is that up to a certain data set size, it all fits in memory, and you're going from CPU/memory bandwidth limited to I/O limited. If this is the case, then a faster storage subsystem is the only real answer. If the database is mostly read, then a large RAID5 or RAID 1+0 array should help quite a bit. You might wanna post some explain analyze of the queries that are going slower at some point in size, along with schema for those tables etc... ---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives? http://archives.postgresql.org