John Allgood wrote:
I think maybe I didn't explain myself well enough. At most we will service 200-250 connections across all the 9 databases mentioned. The database we are building is for a trucking company. Each of the databases represents a different division. With one master database that everything is updated to. Most of the access to the database is by simple queries. Most of the IO intensive stuff is done when revenue reports are generated and when we have our month/year end processing. All the trucking loads that are mark as delivered are transferred to our master database during night time processing. All that will be handled using custom scripts. Maybe I have given a better explanation of the application. my biggest concern is how to partition the shared storage for maximum performance. Is there a real benifit to having more that one raid5 partition or am I wasting my time.
John Allgood - ESC
If you read the general advice statements, it's actually better to not use raid5, but to use raid10 (striping and mirroring). Simply because raid5 writing is quite poor.
Also, if you have the disks, the next best improvement is to move pg_xlog onto it's own set of disks. I think that gets as much as 25% improvement by itself. pg_xlog is an append process, which must complete before the actual data gets updated, so giving it it's own set of spindles reduces seek time, and lets the log be written quickly. I think there is even some benefit to making pg_xlog be a solid state disk, as it doesn't have to be huge, but having high I/O rates can remove it as a bottleneck. (I'm not positive how large pg_xlog gets, but it is probably small compared with the total db size, and I think it can be flushed periodically as transactions are completed.)
I'm not sure what you are considering "shared storage". Are you thinking that all the machines will be mounting a remote drive for writing the DB? They should all have their own local copy (multiple masters on the same database is not supported).
I think it is possible to get better performance by having more raid systems. But it is application dependent. If you know that you have 2 tables that are being updated often and independently, then having each one on it's own raid would allow better concurrency.
But it sounds like in your app you get concurrency by having a bunch of remote databases, which then do bulk updates on the master database. I think if you are careful to have each of the remote dbs update the master at a slightly different time, you could probably get very good transaction rates.
Description: OpenPGP digital signature