At 09:26 AM 11/22/2005, Guillaume Smet wrote:
Ron wrote:
If I understand your HW config correctly, all of the pg stuff is on the same RAID 10 set?

No, the system and the WAL are on a RAID 1 array and the data on their own RAID 10 array.

As has been noted many times around here, put the WAL on its own dedicated HDs. You don't want any head movement on those HDs.

As I said earlier, there are only a few writes in the database, so I'm not really sure the WAL can be a limitation: IIRC, it's only used for writes, isn't it?

When you reach a WAL checkpoint, pg commits WAL data to HD... and does almost nothing else until said commit is done.

Don't you think we should have some io wait if the database was waiting for the WAL? We _never_ have any io wait on this server but our CPUs are still 30-40% idle.
_Something_ is doing long bursts of write IO on sdb and sdb1 every 30 minutes or so according to your previous posts.

Profile your DBMS and find out what.
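
One rough way to catch those bursts in the act, sketched under assumptions: on Linux, /proc/diskstats exposes a cumulative "sectors written" counter per block device, so two snapshots taken a few seconds apart show which device took the writes in between. The function names, the 10 second interval, and the number of rounds below are illustrative choices, not anything from the posts above, and the parser assumes the full 14-column line layout (lines with fewer columns are skipped).

```python
import time


def parse_diskstats(text):
    """Return {device: cumulative sectors written} from /proc/diskstats text."""
    written = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # short or malformed line: skip it
        # fields[2] is the device name, fields[9] is sectors written
        written[fields[2]] = int(fields[9])
    return written


def write_deltas(before, after):
    """Sectors written per device between two snapshots."""
    return {dev: after[dev] - before[dev] for dev in after if dev in before}


def watch(path="/proc/diskstats", interval=10, rounds=6):
    """Print per-device write deltas every `interval` seconds (hypothetical helper)."""
    with open(path) as f:
        prev = parse_diskstats(f.read())
    for _ in range(rounds):
        time.sleep(interval)
        with open(path) as f:
            cur = parse_diskstats(f.read())
        print(time.strftime("%H:%M:%S"), write_deltas(prev, cur))
        prev = cur
```

Calling watch() around the time a burst is expected, and lining the timestamps up against the pg logs, should at least narrow down what is writing.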

A typical top we have on this server is:
 15:22:39  up 24 days, 13:30,  2 users,  load average: 3.86, 3.96, 3.99
156 processes: 153 sleeping, 3 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total   50.6%    0.0%    4.7%   0.0%     0.6%    0.0%   43.8%
           cpu00   47.4%    0.0%    3.1%   0.3%     1.5%    0.0%   47.4%
           cpu01   43.7%    0.0%    3.7%   0.0%     0.5%    0.0%   51.8%
           cpu02   58.9%    0.0%    7.7%   0.0%     0.1%    0.0%   33.0%
           cpu03   52.5%    0.0%    4.1%   0.0%     0.1%    0.0%   43.0%
Mem:  3857224k av, 3307416k used,  549808k free,       0k shrd,   80640k buff
                   2224424k actv,  482552k in_d,   49416k in_c
Swap: 4281272k av, 10032k used, 4271240k free 2602424k cached

As you can see, we don't swap, we have free memory, we have all our data cached (our database size is 1.5 GB).

Context switches are between 10,000 and 20,000 per second.
That's actually a reasonably high CS rate.  Again, why?
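
For reference, the system-wide CS rate can be derived from the cumulative "ctxt" counter in /proc/stat on Linux (the "cs" column of vmstat reports the same quantity). The following is a minimal sketch, with made-up function names and sample numbers, of turning two readings of that counter into a per-second rate.

```python
def parse_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        if line.startswith("ctxt "):
            return int(line.split()[1])
    raise ValueError("no ctxt line found")


def ctxt_rate(first, second, seconds):
    """Context switches per second between two readings taken `seconds` apart."""
    return (second - first) / seconds


# Illustrative readings, not measurements from the server discussed here
first = parse_ctxt("cpu  1 2 3 4\nctxt 1000\nbtime 5\n")
second = parse_ctxt("cpu  2 3 4 5\nctxt 151000\nbtime 5\n")
rate = ctxt_rate(first, second, 10)
```

A rate like this says how often switches happen but not which processes cause them; that still takes per-process profiling.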

This concept works for other tables as well. If you have tables that both want service at the same time, disk arm contention will drag performance into the floor when they are on the same HW set. Profile your HD access and put tables that want to be accessed at the same time on different HD sets. Even if you have to buy more HW to do it.

I use iostat and I can only see a little write activity and no read activity on both raid arrays.
Remember it's not just the overall amount, it's _when_ and _where_ the write activity takes place. If you have almost no write activity, but whenever it happens it all happens to the same place by multiple things contending for the same HDs, your performance during that time will be poor.

Since the behavior you are describing fits that cause very well, I'd see if you can verify that's what's going on.
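
As a sketch of what "verify" could look like: given per-interval write counts for each device or partition (however they are collected, for example from iostat run at a short interval), flag the intervals where several of them are busy at once. The threshold, the device names sda1 and sda2, and the sample numbers are all hypothetical.

```python
def find_bursts(samples, threshold, min_devices=2):
    """Return (label, busy_devices) for intervals with concurrent heavy writes.

    samples: list of (label, {device: sectors written in that interval}).
    An interval is flagged when at least `min_devices` devices each wrote
    `threshold` sectors or more.
    """
    bursts = []
    for label, deltas in samples:
        busy = sorted(dev for dev, n in deltas.items() if n >= threshold)
        if len(busy) >= min_devices:
            bursts.append((label, busy))
    return bursts


# Hypothetical samples: mostly quiet, with one interval where both
# partitions on the same array are written heavily at the same time
samples = [
    ("15:00", {"sda1": 8, "sda2": 0}),
    ("15:30", {"sda1": 4000, "sda2": 3500}),
    ("15:31", {"sda1": 16, "sda2": 0}),
]
bursts = find_bursts(samples, threshold=1000)
```

If the flagged intervals coincide with the slow periods, that supports the contention explanation; if they don't, look elsewhere.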


