Re: [PERFORM] Config review
On Tue, Dec 07, 2004 at 10:07:54AM -0500, Tom Lane wrote: > If you are using a RAID configuration it might just be that you need > to adjust the configuration (IIRC, there are some RAID setups that > are not very write-friendly). Otherwise you may have little alternative > but to buy faster disks. It might be that altering the Clariion array from RAID 5 to RAID 1+0 would make a difference; but I'd be very surprised to learn that you could get that array to go a whole lot faster. One thing that might also be worth investigating is whether performance actually goes up by moveing the WAL into the array. We've had some remarkably good experiences with our recently-acquired EMC. A -- Andrew Sullivan | [EMAIL PROTECTED] When my information changes, I alter my conclusions. What do you do sir? --attr. John Maynard Keynes ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Config review
Greg Stark <[EMAIL PROTECTED]> writes: > Bryan Vest <[EMAIL PROTECTED]> writes: >> Currently the database is at a size of 87.6Gig. A Vacuum Analyze runs every >> night and has been taking 4 or 5 hours to complete. Everything seems to run >> fine for a while, then at some point the load goes through the roof and the >> iowait % also goes way up. It will recover after a little bit and then do the >> same thing all over again. > While others have pointed out problems with the config I don't think any of > them explains this irregular behaviour. As Greg says, it might be checkpoints or a background query. If it actually is the vacuum itself causing the variation in load, the theory that comes to mind is that the performance tanks when the vacuum run switches from find-dead-tuples to clean-indexes mode; clean-indexes is usually a lot more I/O intensive. ISTM it actually doesn't matter much which of these explanations is correct, because all three imply the same thing: not enough disk I/O bandwidth. The disk is near saturation already and any increase in demand drives response time over the knee of the curve. If you are using a RAID configuration it might just be that you need to adjust the configuration (IIRC, there are some RAID setups that are not very write-friendly). Otherwise you may have little alternative but to buy faster disks. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Config review
Bryan Vest <[EMAIL PROTECTED]> writes: > Currently the database is at a size of 87.6Gig. A Vacuum Analyze runs every > night and has been taking 4 or 5 hours to complete. Everything seems to run > fine for a while, then at some point the load goes through the roof and the > iowait % also goes way up. It will recover after a little bit and then do the > same thing all over again. When this happens access to the web based user > interface slows way down for our customers. Any input for improvements to this > config would be appreciated, Thanks. While others have pointed out problems with the config I don't think any of them explains this irregular behaviour. From what you're describing the response time is ok most of the time except for these particular bursts? Do they occur at regular intervals? Is it possible it's just the checkpointing? Can you see which volumes the i/o traffic is on? Is it on the local transaction log files or is it on the data files? Does the write i/o spike upwards or is it just a storm of read i/o? Also, incidentally, Is it possible you have a cron job running vacuum and don't realize it? If it happens at irregular intervals then it could be a single bad query that's causing the problem. One bad query would cause a sequential scan of your 87G and potentially push out a lot of data from the cache. I imagine this might also be especially bad with the shared_buffers being out of whack. You might start by checking the easiest thing first, set log_min_duration_statement to something high and slowly lower it until it's printing a handful of queries during the heaviest period. You could also look for a pgsql_tmp directory that indicate a disk sort is happening, which would mean some query is trying to sort a lot of data. You might have to lower sort_mem to a conservative value before you could see that though. The pgsql_tmp directory appears (and disappears?) as needed, it's something like this: bash-2.05b# ls /var/lib/postgres/data/base/17150/pgsql_tmp pgsql_tmp22184.0 -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])