> I've got an application that needs to chunk through ~2GB of data. The 
> data is ~7000 different sets of 300 records each. I put all of the data 
> into a postgres database but that doesn't look like its going to work 
> because of how the data lives on the disk.

Your problem is curable through 4 steps:

1) adjust your postgresql.conf to appropriate levels for memory usage.
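For step 1, the usual suspects look something like this; the values below are illustrative only (tune them to your actual RAM), and some parameter names vary between PostgreSQL versions:

```
# postgresql.conf -- example values, not a recommendation
shared_buffers = 16384         # shared buffer cache in 8KB pages (~128MB here)
sort_mem = 16384               # per-sort memory in KB ("work_mem" in later versions)
effective_cache_size = 131072  # ~1GB; tells the planner how much the OS is caching
```

The point is to get PostgreSQL using a healthy fraction of RAM instead of the tiny stock defaults, while leaving room for the OS filesystem cache.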

2) If those sets of 300 are blocks in some contiguous order, then cluster them 
to force their physical ordering on disk to be the same order you want to 
read them in.  This will require you to re-cluster whenever you change a 
significant number of records, but from the sound of it that happens 
infrequently.

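Clustering in step 2 is a one-time physical sort of the table. The table, column, and index names below are made-up placeholders for your schema:

```
-- Assume a "records" table keyed by (set_id, record_no).
CREATE INDEX records_set_idx ON records (set_id, record_no);

-- Rewrite the table in index order so each 300-record set is contiguous on disk.
CLUSTER records_set_idx ON records;

-- CLUSTER doesn't maintain the ordering as rows change; re-run it after
-- heavy updates, and ANALYZE so the planner sees the new layout.
ANALYZE records;
```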
3) Get better disks, preferably a RAID array, or just very fast SCSI if the 
database is small.  If you're budget-constrained, Linux software RAID (or 
BSD RAID) on IDE disks is cheap.  What kind of RAID depends on what else 
you'll be doing with the app; RAID 5 is better for read-only access, RAID 1+0 
is better for read-write.
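As a sketch of the cheap Linux software RAID option, assuming four IDE disks and the mdadm tool (device names and mount point are examples, not a prescription):

```
# Build a RAID 1+0 array from four IDE disks and dedicate it to PostgreSQL.
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
      /dev/hda1 /dev/hdb1 /dev/hdc1 /dev/hdd1
mkfs -t ext3 /dev/md0
mount /dev/md0 /var/lib/pgsql
```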

4) Make sure that you aren't dumping the data to the same disk PostgreSQL 
lives on!  Preferably, make sure that your swap partition is on a different 
disk/array from PostgreSQL.  If the computing app is complex and requires 
disk reads aside from Postgres data, you should make sure that it lives on 
yet another disk.  Or you can simplify this with a good, really large 
multi-channel RAID array.
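One possible layout for step 4, with made-up device names; putting the WAL (pg_xlog) on its own spindle via a symlink is a common variant of the same idea:

```
# Illustrative disk layout -- devices and paths are placeholders.
/dev/hda  ->  /  and swap                    OS and swap space
/dev/md0  ->  /var/lib/pgsql                 PostgreSQL data
/dev/hdc  ->  /var/lib/pgsql/data/pg_xlog    WAL, via symlink or mount
/dev/hdd  ->  /data/output                   where the app dumps its results
```

The goal is simply that the database reads, the WAL writes, the app's output, and swap never compete for the same disk heads.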

-Josh Berkus
 Aglio Database Solutions
 San Francisco
