On Mon, Jan 11, 2010 at 8:50 AM, Scott Marlowe <scott.marl...@gmail.com> wrote:
> On Mon, Jan 11, 2010 at 6:44 AM, Bob Dusek <redu...@gmail.com> wrote:
> > Hello,
> >
> > We're running Postgres 8.4.2 on Red Hat 5, on pretty hefty hardware...
> >
> > 4x E7420 Xeon, four cores each (for a total of 16 cores)
> > 2.13 GHz, 8M cache, 1066 MHz FSB
> > 32 Gigs of RAM
> > 15K RPM drives in striped RAID
>
> What method of striped RAID?  RAID-5?  RAID-10?  RAID-4?  RAID-0?

RAID-0

> > Things run fine, but when we get a lot of concurrent queries running, we
> > see a pretty good slowdown.
>
> Define "a lot".

We have an application server that is processing requests. Each request consists of a combination of selects, inserts, and deletes. We actually see degradation once we get more than 40 concurrent requests.

The exact number of queries executed by each request isn't known; it varies per request, but an example request would be about 16 inserts and 113 selects. Any given request can't execute more than a single query at a time.

To be more specific about the degradation: we've set log_min_duration_statement = 200, and when we run with 40 concurrent requests we don't see any queries showing up in the log. With 60 concurrent requests, queries start to show up, and with 200+ requests we see multi-second queries. Some of this is to be expected, since performance will degrade as utilization climbs. But the hardware doesn't appear to be very busy, and that's where we're hoping for some help. We want Postgres to eat up as many resources as possible and chug through our queries faster. Right now it runs slower as utilization rises, yet the CPUs still spend too much time idle.

> > We don't have much experience with this sort of hardware.  Does anyone have
> > an example config file we could use as a good starting point for this sort
> > of hardware?
> >
> > We have a fair amount of row-level inserts and deletes going on (probably as
> > much data is inserted and deleted in a day as permanently resides in the db).
>
> What do the following commands tell you?
>
> iostat -x 10 (first iteration doesn't count)

Here's some iostat output (the 3rd sample from iostat -x 10), taken while we were processing 256 simultaneous requests:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          34.29    0.00    7.09    0.03    0.00   58.58

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00  112.20   0.00  133.40     0.00  1964.80    14.73     0.42   3.17   0.04   0.48
sda1       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
sda2       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
sda3       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
sda4       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
sda5       0.00  112.20   0.00  133.40     0.00  1964.80    14.73     0.42   3.17   0.04   0.48
sdb        0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
dm-0       0.00    0.00   0.00    0.40     0.00     3.20     8.00     0.00   0.00   0.00   0.00
dm-1       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
dm-2       0.00    0.00   0.00   99.90     0.00   799.20     8.00     0.15   1.50   0.01   0.12
dm-3       0.00    0.00   0.00    0.60     0.00     4.80     8.00     0.00   0.33   0.17   0.01
dm-4       0.00    0.00   0.00  144.70     0.00  1157.60     8.00     0.46   3.17   0.02   0.35
dm-5       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00

The iowait seems pretty low, doesn't it?
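In case it's useful, this is roughly how we collect these snapshots while a test run is in flight; the interval, sample counts, and output paths below are just placeholders, not necessarily what we actually use:

    #!/bin/bash
    # Rough sketch: sample disk, memory, and CPU activity for ~5 minutes
    # while the load test runs.  Interval/count and the output directory
    # are arbitrary placeholders.
    OUT=/tmp/perfsnap
    mkdir -p "$OUT"
    iostat -x 10 30    > "$OUT/iostat.out" &   # extended device stats, 10s interval, 30 samples
    vmstat 10 30       > "$OUT/vmstat.out" &   # memory/swap/io/cpu summary at the same cadence
    top -b -n 30 -d 10 > "$OUT/top.out"    &   # batch-mode top, 30 iterations, 10s apart
    wait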
> vmstat 10 (again, first iteration doesn't count)

Here's the vmstat output (also taken during 256 simultaneous requests). I've clipped out the 6th sample, which seems representative of the whole run:

[r...@ecpe1 pg_log]# vmstat 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd     free   buff   cache   si   so    bi    bo    in    cs us sy id wa st
 8  0      0 21818136 696692 3871172    0    0     0  1042  2261 48418 34  7 59  0  0

That's a lot of free mem, which is to be expected. Our database is not very large.
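Since neither the disks nor memory look stressed, we've also started sampling pg_stat_activity during the runs to see whether the backends are actually executing queries or mostly sitting idle or waiting on locks. Roughly like this ("ourdb" is a placeholder database name, and I'm going from memory on the 8.4 column names):

    # count backends by state: waiting on a lock vs. idle vs. actively running
    psql -d ourdb -c "
      SELECT waiting,
             current_query LIKE '<IDLE>%' AS idle,
             count(*)
        FROM pg_stat_activity
       GROUP BY 1, 2
       ORDER BY 1, 2;"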
> top

(taken during 256 simultaneous requests)

top - 10:11:43 up 12 days, 20:48,  6 users,  load average: 10.48, 4.16, 2.83
Tasks: 798 total,   8 running, 790 sleeping,   0 stopped,   0 zombie
Cpu0  : 33.3%us,  7.6%sy,  0.0%ni, 59.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 31.6%us,  5.9%sy,  0.0%ni, 62.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 33.0%us,  6.6%sy,  0.0%ni, 60.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 35.4%us,  6.2%sy,  0.0%ni, 58.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu4  : 36.3%us,  5.6%sy,  0.0%ni, 58.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  : 37.4%us,  6.2%sy,  0.0%ni, 56.1%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu6  : 38.1%us,  6.0%sy,  0.0%ni, 56.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  : 39.2%us,  7.5%sy,  0.0%ni, 52.9%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu8  : 35.5%us,  7.2%sy,  0.0%ni, 56.9%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu9  : 37.8%us,  7.6%sy,  0.0%ni, 54.3%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu10 : 39.5%us,  5.9%sy,  0.0%ni, 54.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 : 34.5%us,  7.2%sy,  0.0%ni, 58.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 : 41.1%us,  6.9%sy,  0.0%ni, 50.3%id,  0.0%wa,  0.0%hi,  1.6%si,  0.0%st
Cpu13 : 38.0%us,  7.3%sy,  0.0%ni, 54.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 : 36.2%us,  6.2%sy,  0.0%ni, 57.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 : 36.8%us,  8.2%sy,  0.0%ni, 54.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  28817004k total,  8008372k used, 20808632k free,   705772k buffers
Swap: 30867448k total,        0k used, 30867448k free,  4848376k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2641 postgres  16   0 8524m  21m  16m S  3.3  0.1  0:00.25 postmaster
 2650 postgres  15   0 8524m  21m  16m S  3.3  0.1  0:00.25 postmaster
 2706 postgres  16   0 8524m  21m  15m S  3.3  0.1  0:00.20 postmaster
 2814 postgres  15   0 8523m  18m  14m S  3.3  0.1  0:00.10 postmaster
 2829 postgres  15   0 8523m  18m  14m S  3.3  0.1  0:00.10 postmaster
 2618 postgres  15   0 8524m  21m  16m S  3.0  0.1  0:00.25 postmaster
 2639 postgres  15   0 8524m  21m  16m R  3.0  0.1  0:00.25 postmaster
 2671 postgres  15   0 8524m  21m  16m S  3.0  0.1  0:00.23 postmaster
 2675 postgres  16   0 8524m  21m  16m S  3.0  0.1  0:00.23 postmaster
 2694 postgres  15   0 8524m  21m  15m S  3.0  0.1  0:00.23 postmaster
 2698 postgres  15   0 8524m  21m  15m S  3.0  0.1  0:00.21 postmaster
 2702 postgres  15   0 8524m  21m  15m S  3.0  0.1  0:00.19 postmaster
 2767 postgres  15   0 8524m  20m  14m S  3.0  0.1  0:00.13 postmaster
 2776 postgres  15   0 8524m  20m  14m S  3.0  0.1  0:00.14 postmaster
 2812 postgres  15   0 8523m  18m  14m S  3.0  0.1  0:00.11 postmaster
 2819 postgres  15   0 8523m  18m  14m S  3.0  0.1  0:00.09 postmaster
 2823 postgres  16   0 8523m  18m  14m S  3.0  0.1  0:00.09 postmaster
 2828 postgres  15   0 8523m  18m  14m S  3.0  0.1  0:00.09 postmaster
 2631 postgres  15   0 8524m  21m  15m S  2.6  0.1  0:00.24 postmaster
 2643 postgres  15   0 8524m  21m  15m S  2.6  0.1  0:00.23 postmaster
 2656 postgres  15   0 8524m  21m  15m S  2.6  0.1  0:00.22 postmaster
 2658 postgres  16   0 8524m  21m  15m S  2.6  0.1  0:00.22 postmaster
 2664 postgres  16   0 8524m  21m  16m S  2.6  0.1  0:00.24 postmaster
 2674 postgres  15   0 8524m  21m  15m S  2.6  0.1  0:00.23 postmaster
 2679 postgres  15   0 8524m  21m  15m S  2.6  0.1  0:00.22 postmaster
 2684 postgres  15   0 8524m  21m  15m S  2.6  0.1  0:00.21 postmaster
 2695 postgres  15   0 8524m  20m  15m S  2.6  0.1  0:00.18 postmaster
 2699 postgres  15   0 8524m  20m  15m S  2.6  0.1  0:00.18 postmaster
 2703 postgres  15   0 8524m  20m  15m S  2.6  0.1  0:00.18 postmaster
 2704 postgres  15   0 8524m  20m  15m R  2.6  0.1  0:00.17 postmaster
 2713 postgres  15   0 8524m  20m  15m S  2.6  0.1  0:00.18 postmaster
 2734 postgres  15   0 8524m  20m  14m S  2.6  0.1  0:00.14 postmaster
 2738 postgres  15   0 8524m  20m  15m S  2.6  0.1  0:00.14 postmaster

> What you're looking for is iowait / utilization.
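Both iowait and %util look close to zero in the iostat output above. For reference, this is roughly how we pull the statements that log_min_duration_statement = 200 catches out of pg_log after a run, to see how many there are and how slow they get (the log directory and file name pattern below are just examples; ours depend on log_directory and log_filename):

    cd /var/lib/pgsql/data/pg_log            # example path; wherever log_directory points
    # how many statements exceeded log_min_duration_statement during the run
    grep -c "duration:" postgresql-2010-01-11*.log
    # the ten slowest, sorted by the logged duration in ms
    grep "duration:" postgresql-2010-01-11*.log \
      | sed 's/.*duration: \([0-9.]*\) ms.*/\1 &/' \
      | sort -n \
      | tail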