On Dec 11, 2012, at 2:51 AM, Niels Kristian Schjødt <nielskrist...@autouncle.com> wrote:
> #### Pitch
> ######################################################################################
> I previously posted this question
> http://archives.postgresql.org/pgsql-performance/2012-11/msg00289.php
> about a performance issue with an update query.
> The question evolved into a more general discussion about my setup and about
> a lot of I/O wait that I was encountering. Since then, I have become a whole
> lot more familiar with measuring things, and now I "just" need some
> experienced eyes to judge which direction I should go in - do I have a
> hardware issue or a software issue - and what action should I take?
>
> ##### My setup
> #############################################################################
> The use case:
> At night we do a LOT of data maintenance, so the load on the database is very
> different from the daytime load. However, we would like to be able to do some
> of it in the daytime; it is simply too "heavy" on the database as it is right
> now. The stats shown below are from one of those "heavy" load periods.
>
> Hardware:
> - 32 GB RAM
> - 8-core Xeon E3-1245 processor
> - Two SEAGATE ST33000650NS drives (called sdc and sdd in the stats) in a
>   software RAID1 array (called md2 in the stats)
> - Two INTEL SSDSC2CW240A3 SSD drives (called sda and sdb in the stats) in a
>   software RAID1 array (called md3 in the stats)
>
> Software:
> Postgres 9.2 running on 64-bit Ubuntu 12.04 with kernel 3.2
>
> Configuration:
> # postgresql.conf (a shortlist of everything changed from the default)
> data_directory = '/var/lib/postgresql/9.2/main'
> hba_file = '/etc/postgresql/9.2/main/pg_hba.conf'
> ident_file = '/etc/postgresql/9.2/main/pg_ident.conf'
> external_pid_file = '/var/run/postgresql/9.2-main.pid'
> listen_addresses = '192.168.0.2, localhost'
> port = 5432
> max_connections = 300
> unix_socket_directory = '/var/run/postgresql'
> wal_level = hot_standby
> synchronous_commit = off
> archive_mode = on
> archive_command = 'rsync -a %p postgres@192.168.0.4:/var/lib/postgresql/9.2/wals/%f </dev/null'
> max_wal_senders = 1
> wal_keep_segments = 32
> log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
> datestyle = 'iso, mdy'
> lc_monetary = 'en_US.UTF-8'
> lc_numeric = 'en_US.UTF-8'
> lc_time = 'en_US.UTF-8'
> default_text_search_config = 'pg_catalog.english'
> default_statistics_target = 100
> maintenance_work_mem = 1GB
> checkpoint_completion_target = 0.9
> effective_cache_size = 22GB
> work_mem = 160MB
> wal_buffers = 4MB
> checkpoint_segments = 100
> shared_buffers = 4GB
> checkpoint_timeout = 10min
>
> The kernel has been tweaked like so:
> vm.dirty_ratio = 10
> vm.dirty_background_ratio = 1
> kernel.shmmax = 8589934592
> kernel.shmall = 17179869184
>
> The pg_xlog folder has been moved onto the SSD array (md3) and symlinked back
> into the postgres dir.

Actually, you should move the xlog to the rotating drives, since the WAL is written sequentially, and move everything else to the SSDs, because of the random I/O pattern. (There is a short sketch of this at the end of this mail.)

> ##### The stats
> ###############################################################
> These are the typical observations/stats I see in one of these periods:
>
> 1)
> At the top level, this is what I see in New Relic:
> https://rpm.newrelic.com/public/charts/6ewGRle6bmc
>
> 2)
> When the database is loaded like this, I see a lot of queries taking up to
> 1000 times as long as they would when the database is not loaded so heavily.
>
> 3)
> sudo iostat -dmx (typical usage)
> Linux 3.2.0-33-generic (master-db)  12/10/2012  _x86_64_  (8 CPU)
>
> Device:  rrqm/s  wrqm/s    r/s     w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> sda        0.00    6.52   3.59   26.61   0.22   0.74     65.49      0.01   0.40     0.77     0.35   0.14   0.43
> sdb        0.00    8.31   0.03   28.38   0.00   0.97     69.63      0.01   0.52     0.27     0.52   0.15   0.43
> sdc        1.71   46.01  34.83  116.62   0.56   4.06     62.47      1.90  12.57    21.81     9.81   1.89  28.66
> sdd        1.67   46.14  34.89  116.49   0.56   4.06     62.46      1.58  10.43    21.66     7.07   1.89  28.60
> md1        0.00    0.00   0.00    0.00   0.00   0.00      2.69      0.00   0.00     0.00     0.00   0.00   0.00
> md0        0.00    0.00   0.11    0.24   0.00   0.00      8.00      0.00   0.00     0.00     0.00   0.00   0.00
> md2        0.00    0.00  72.99  161.95   1.11   4.06     45.10      0.00   0.00     0.00     0.00   0.00   0.00
> md3        0.00    0.00   0.05   32.32   0.00   0.74     47.00      0.00   0.00     0.00     0.00   0.00   0.00
>
> 4)
> sudo iotop -oa (running for about a minute or so)
>   TID  PRIO  USER       DISK READ  DISK WRITE  SWAPIN      IO>  COMMAND
>   292  be/4  root         0.00 B      0.00 B   0.00 %  99.33 %  [md2_raid1]
>  2815  be/4  postgres    19.51 M     25.90 M   0.00 %  45.49 %  postgres: autovacuum worker process production
> 32553  be/4  postgres    45.74 M      9.38 M   0.00 %  37.89 %  postgres: user production 192.168.0.3(58866) UPDATE
> 32570  be/4  postgres     6.91 M     35.02 M   0.00 %  16.71 %  postgres: user production 192.168.0.3(35547) idle
> 32575  be/4  postgres     4.06 M     43.90 M   0.00 %  16.62 %  postgres: user production 192.168.0.3(35561) SELECT
> 31673  be/4  postgres     4.14 M     52.16 M   0.00 %  16.24 %  postgres: user production 192.168.0.3(39112) idle
> 32566  be/4  postgres     4.73 M     44.95 M   0.00 %  15.66 %  postgres: user production 192.168.0.3(35531) idle
> 32568  be/4  postgres     4.50 M     33.84 M   0.00 %  14.62 %  postgres: user production 192.168.0.3(35543) SELECT
> 32573  be/4  postgres     3.20 M     34.44 M   0.00 %  13.98 %  postgres: user production 192.168.0.3(35559) idle
> 31590  be/4  postgres     3.23 M     29.72 M   0.00 %  13.90 %  postgres: user production 192.168.0.3(50690) idle in transaction
> 32577  be/4  postgres     5.09 M     25.54 M   0.00 %  13.63 %  postgres: user production 192.168.0.3(35563) idle
> 32565  be/4  postgres     2.06 M     35.93 M   0.00 %  13.41 %  postgres: user production 192.168.0.3(35529) SELECT
> 32546  be/4  postgres     4.48 M     36.49 M   0.00 %  13.39 %  postgres: user production 192.168.0.3(56927) UPDATE waiting
> 32569  be/4  postgres     3.50 M     26.75 M   0.00 %  12.82 %  postgres: user production 192.168.0.3(35545) INSERT
> 31671  be/4  postgres     4.58 M     24.45 M   0.00 %  12.76 %  postgres: user production 192.168.0.3(34841) idle in transaction
> 32551  be/4  postgres     3.26 M     31.77 M   0.00 %  12.06 %  postgres: user production 192.168.0.3(58864) idle in transaction
> 32574  be/4  postgres     5.32 M     32.92 M   0.00 %  11.70 %  postgres: user production 192.168.0.3(35560) idle
> 32572  be/4  postgres     3.00 M     32.66 M   0.00 %  11.66 %  postgres: user production 192.168.0.3(35558) UPDATE
> 32560  be/4  postgres     5.12 M     25.89 M   0.00 %  11.52 %  postgres: user production 192.168.0.3(33886) SELECT
> 32567  be/4  postgres     4.66 M     36.47 M   0.00 %  11.44 %  postgres: user production 192.168.0.3(35534) SELECT
> 32571  be/4  postgres     2.86 M     31.27 M   0.00 %  11.31 %  postgres: user production 192.168.0.3(35557) SELECT
> 32552  be/4  postgres     4.38 M     28.75 M   0.00 %  11.09 %  postgres: user production 192.168.0.3(58865) idle in transaction
> 32554  be/4  postgres     3.69 M     30.21 M   0.00 %  10.90 %  postgres: user production 192.168.0.3(58870) UPDATE
>   339  be/3  root         0.00 B      2.29 M   0.00 %   9.81 %  [jbd2/md2-8]
> 32576  be/4  postgres     3.37 M     19.91 M   0.00 %   9.73 %  postgres: user production 192.168.0.3(35562) idle
> 32555  be/4  postgres     3.09 M     31.96 M   0.00 %   9.02 %  postgres: user production 192.168.0.3(58875) SELECT
> 27548  be/4  postgres     0.00 B     97.12 M   0.00 %   7.41 %  postgres: writer process
> 31445  be/4  postgres   924.00 K     27.35 M   0.00 %   7.11 %  postgres: user production 192.168.0.1(34536) idle
> 31443  be/4  postgres     2.54 M      4.56 M   0.00 %   6.32 %  postgres: user production 192.168.0.1(34508) idle
> 31459  be/4  postgres  1480.00 K     21.36 M   0.00 %   5.63 %  postgres: user production 192.168.0.1(34543) idle
>  1801  be/4  postgres  1896.00 K     10.89 M   0.00 %   5.57 %  postgres: user production 192.168.0.3(34177) idle
> 32763  be/4  postgres  1696.00 K      6.95 M   0.00 %   5.33 %  postgres: user production 192.168.0.3(57984) SELECT
>  1800  be/4  postgres     2.46 M      5.13 M   0.00 %   5.24 %  postgres: user production 192.168.0.3(34175) SELECT
>  1803  be/4  postgres  1816.00 K      9.09 M   0.00 %   5.16 %  postgres: user production 192.168.0.3(34206) idle
> 32578  be/4  postgres     2.57 M     11.62 M   0.00 %   5.06 %  postgres: user production 192.168.0.3(35564) SELECT
> 31440  be/4  postgres     3.02 M      4.04 M   0.00 %   4.65 %  postgres: user production 192.168.0.1(34463) idle
> 32605  be/4  postgres  1844.00 K     11.82 M   0.00 %   4.49 %  postgres: user production 192.168.0.3(40399) idle
> 27547  be/4  postgres     0.00 B      0.00 B   0.00 %   3.93 %  postgres: checkpointer process
> 31356  be/4  postgres  1368.00 K      3.27 M   0.00 %   3.93 %  postgres: user production 192.168.0.1(34450) idle
> 32542  be/4  postgres  1180.00 K      6.05 M   0.00 %   3.90 %  postgres: user production 192.168.0.3(56859) idle
> 32523  be/4  postgres  1088.00 K      4.33 M   0.00 %   3.59 %  postgres: user production 192.168.0.3(48164) idle
> 32606  be/4  postgres  1964.00 K      6.94 M   0.00 %   3.51 %  postgres: user production 192.168.0.3(40426) SELECT
> 31466  be/4  postgres  1596.00 K      3.11 M   0.00 %   3.47 %  postgres: user production 192.168.0.1(34550) idle
> 32544  be/4  postgres  1184.00 K      4.25 M   0.00 %   3.38 %  postgres: user production 192.168.0.3(56861) idle
> 31458  be/4  postgres  1088.00 K   1528.00 K   0.00 %   3.33 %  postgres: user production 192.168.0.1(34541) idle
> 31444  be/4  postgres   884.00 K      4.23 M   0.00 %   3.27 %  postgres: user production 192.168.0.1(34510) idle
> 32522  be/4  postgres   408.00 K      2.98 M   0.00 %   3.27 %  postgres: user production 192.168.0.5(38361) idle
> 32762  be/4  postgres  1156.00 K      5.28 M   0.00 %   3.20 %  postgres: user production 192.168.0.3(57962) idle
> 32582  be/4  postgres  1084.00 K      3.38 M   0.00 %   2.86 %  postgres: user production 192.168.0.5(43104) idle
> 31353  be/4  postgres     2.04 M      3.02 M   0.00 %   2.82 %  postgres: user production 192.168.0.1(34444) idle
> 31441  be/4  postgres   700.00 K      2.68 M   0.00 %   2.64 %  postgres: user production 192.168.0.1(34465) idle
> 31462  be/4  postgres   980.00 K      3.50 M   0.00 %   2.57 %  postgres: user production 192.168.0.1(34547) idle
> 32709  be/4  postgres   428.00 K      3.23 M   0.00 %   2.56 %  postgres: user production 192.168.0.5(34323) idle
>   685  be/4  postgres   748.00 K      3.59 M   0.00 %   2.41 %  postgres: user production 192.168.0.3(34911) idle
>   683  be/4  postgres   728.00 K      3.19 M   0.00 %   2.38 %  postgres: user production 192.168.0.3(34868) idle
> 32765  be/4  postgres   464.00 K      3.76 M   0.00 %   2.21 %  postgres: user production 192.168.0.3(58074) idle
> 32760  be/4  postgres   808.00 K      6.18 M   0.00 %   2.16 %  postgres: user production 192.168.0.3(57958) idle
>  1912  be/4  postgres   372.00 K      3.03 M   0.00 %   2.16 %  postgres: user production 192.168.0.5(33743) idle
> 31446  be/4  postgres  1004.00 K      2.09 M   0.00 %   2.16 %  postgres: user production 192.168.0.1(34539) idle
> 31460  be/4  postgres   584.00 K      2.74 M   0.00 %   2.10 %  postgres: user production 192.168.0.1(34545) idle
>
> 5)
> vmstat 1
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b    swpd    free   buff     cache  si  so    bi     bo     in     cs us sy id wa
>  1  1  573424  321080  27124  28504352   0   0   143    618      0      4  2  0 91  7
>  0  1  573424  320764  27124  28504496   0   0   104  15654   3788   4961  1  0 85 14
>  0  1  573424  320684  27124  28504616   0   0   276  12736   4099   5374  0  1 84 15
>  0  1  573424  319672  27124  28504900   0   0    80   7746   3624   4949  2  0 82 16
>  0  1  573424  319180  27124  28504972   0   0    36  12489   3653   4761  2  0 86 12
>  0  1  573424  318184  27132  28505000   0   0     8  10482   3413   4898  0  0 87 13
>  0  1  573424  318424  27132  28505016   0   0     0   9564   2746   4290  0  0 87 13
>  0  1  573424  318308  27168  28505016   0   0    36  10562   1895   2149  0  0 87 12
>  0  3  573424  318208  27168  28505020   0   0    84  18529   3035   3265  1  0 85 14
>  0  1  573424  318732  27176  28505080   0   0    84  14574   2986   3231  0  0 84 16
>  0  2  573424  317588  27176  28505184   0   0     4   6681   1991   2207  2  1 86 12
>  0  1  573424  316852  27176  28505260   0   0    76   7670   2910   3996  2  1 85 13
>  0  1  573424  316632  27184  28505256   0   0     0   7186   2661   3740  0  0 87 12
>  0  1  573424  316720  27188  28505260   0   0     0   2590   1731   2474  0  0 88 12
>  0  1  573424  314252  27192  28505696   0   0   460  11612   1757   2431  0  0 82 18
>  0  2  573424  313504  27192  28505724   0   0     0  19656   1775   2099  0  0 83 17
>  0  3  573424  313300  27196  28505780   0   0   188   6237   2746   3193  2  0 80 17
>  0  2  573424  312736  27200  28506348   0   0   804  18466   5014   6430  2  1 75 23
>  2 35  573424  307564  27200  28509920   0   0  3912  16280  14377  15470 14  3 28 56
>  0  5  573424  282848  27208  28533964   0   0  7484  27580  22017  25938 17  3 17 63
>  1  5  573424  221100  27208  28563360   0   0  2852   3120  19639  28664 12  5 52 31
>  0  4  573428  229912  26704  28519184   0   4  1208   5890  13976  20851 13  3 56 28
>  0  2  573448  234680  26672  28513632   0  20     0  17204   1694   2636  0  0 71 28
>  3  7  573452  220836  26644  28525548   0   4  1540  36370  27928  36551 17  5 50 27
>  1  3  573488  234380  26556  28517416   0  36   584  19066   8275   9467  3  2 60 36
>  0  1  573488  234496  26556  28517852   0   0    56  47429   3290   4310  0  0 79 20
>
> 6)
> sudo lsof - a hell of a lot of output; I can post it if anyone is interested :-)
>
> #### Notes and thoughts
> ##############################################################################
>
> As you can see, even though I have moved the pg_xlog folder to the SSD array
> (md3), by far the largest amount of writes still goes to the regular HDDs
> (md2), which puzzles me - what can that be?
> From stat 3) (the iostat) I notice that the SSDs don't seem to be anywhere
> near fully utilized - maybe something other than just pg_xlog could be moved
> here?
> I have no idea whether the amount of reads/writes is within what my kind of
> hardware should be able to handle, or whether it is far beyond.
> In stat 4) (the iotop) it says that the RAID array (md2) is the most
> "waiting" part - does that look like a root cause, or more like a symptom of
> some other bottleneck?
>
> Thanks for taking the time to look at my data! :-)
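
To make my suggestion above a bit more concrete: one way to get "everything else" onto the SSDs without moving the whole data directory is a tablespace on md3. This is only a minimal sketch, assuming md3 is mounted at /ssd; the directory and the table/index names are made up for illustration, and each ALTER ... SET TABLESPACE rewrites the object and takes an exclusive lock on it, so it belongs in a quiet window:

-- The location must be an existing, empty directory owned by the postgres OS user
-- (the mount point /ssd is an assumption, not taken from your setup).
CREATE TABLESPACE ssd LOCATION '/ssd/postgresql/tblspc';

-- Move the most write-heavy, randomly accessed objects onto the SSDs
-- (object names here are hypothetical).
ALTER TABLE heavily_updated_table SET TABLESPACE ssd;
ALTER INDEX heavily_updated_table_pkey SET TABLESPACE ssd;

-- Optionally let newly created objects land there by default:
ALTER DATABASE production SET default_tablespace = 'ssd';

pg_xlog itself can then go back onto the rotating drives (md2), with the server stopped, using the same symlink trick you already used, since the WAL is written sequentially anyway.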