On Dec 11, 2012, at 2:51 AM, Niels Kristian Schjødt 
<nielskrist...@autouncle.com> wrote:

> #### Pitch 
> ######################################################################################
> I previously posted this question 
> http://archives.postgresql.org/pgsql-performance/2012-11/msg00289.php about a 
> performance issue with an update query. 
> The question evolved into a more general discussion about my setup, and about 
> a lot of I/O wait that I was encountering. Since then, I have gotten a whole 
> lot more familiar with measuring things, and now I "just" need some 
> experienced eyes to judge which direction I should go in - do I have a 
> hardware issue, or a software issue - and what action should I take?
> 
> #####  My setup 
> #############################################################################
> The use case:
> At night we do a LOT of data maintenance, so the load on the database is 
> very different from the daytime load. We would like to be able to do some of 
> it during the day, but right now it is simply too "heavy" on the database. 
> The stats shown below are from one of those "heavy" load periods.
> 
> Hardware: 
>   - 32 GB RAM 
>   - 8-core Xeon E3-1245 processor
>   - Two SEAGATE ST33000650NS drives (called sdc and sdd in the stats) in a 
> software RAID1 array (called md2 in the stats)
>   - Two INTEL SSDSC2CW240A3 SSD drives (called sda and sdb in the stats) in a 
> software RAID1 array (called md3 in the stats)
> 
> Software:
> PostgreSQL 9.2 running on 64-bit Ubuntu 12.04 with kernel 3.2
> 
> Configuration:
> # postgresql.conf (a shortlist of everything changed from the default)
> data_directory = '/var/lib/postgresql/9.2/main'
> hba_file = '/etc/postgresql/9.2/main/pg_hba.conf'
> ident_file = '/etc/postgresql/9.2/main/pg_ident.conf'
> external_pid_file = '/var/run/postgresql/9.2-main.pid'
> listen_addresses = '192.168.0.2, localhost'
> port = 5432
> max_connections = 300
> unix_socket_directory = '/var/run/postgresql'
> wal_level = hot_standby
> synchronous_commit = off
> archive_mode = on
> archive_command = 'rsync -a %p postgres@192.168.0.4:/var/lib/postgresql/9.2/wals/%f </dev/null'
> max_wal_senders = 1
> wal_keep_segments = 32
> log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
> datestyle = 'iso, mdy'
> lc_monetary = 'en_US.UTF-8'
> lc_numeric = 'en_US.UTF-8'
> lc_time = 'en_US.UTF-8'
> default_text_search_config = 'pg_catalog.english'
> default_statistics_target = 100
> maintenance_work_mem = 1GB
> checkpoint_completion_target = 0.9
> effective_cache_size = 22GB
> work_mem = 160MB
> wal_buffers = 4MB
> checkpoint_segments = 100
> shared_buffers = 4GB
> checkpoint_timeout = 10min
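
A side note on this config (my observation, not something raised in the
thread): work_mem is allocated per sort/hash operation per backend, so with
max_connections = 300 the theoretical worst case for sort memory alone
already exceeds the machine's RAM:

echo $((300 * 160))   # 48000 MB of potential work_mem vs. 32768 MB of RAM

Only a fraction of backends sort at once in practice, but with 300
connections allowed it is worth keeping an eye on.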
> 
> The kernel has been tweaked like so:
> vm.dirty_ratio = 10
> vm.dirty_background_ratio = 1
> kernel.shmmax = 8589934592
> kernel.shmall = 17179869184
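
For reference, a minimal sketch of how settings like these are applied and
made persistent on a stock Ubuntu box (nothing here is specific to this
thread):

sudo sysctl -w vm.dirty_ratio=10
sudo sysctl -w vm.dirty_background_ratio=1
sudo sysctl -w kernel.shmmax=8589934592
sudo sysctl -w kernel.shmall=17179869184
# mirror the same four lines in /etc/sysctl.conf so they survive a reboot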
> 
> The pg_xlog folder has been moved onto the SSD array (md3) and symlinked 
> back into the Postgres data directory.
> 

Actually, you should do the opposite: move the xlog onto the rotating drives, 
since WAL is written sequentially, and move everything else onto the SSDs, 
because the data files see a random I/O pattern.
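
Roughly, a sketch of that layout swap (the /ssd and /hdd mount points are
made up for illustration - substitute the real md3/md2 mounts, mind the
existing pg_xlog symlink, and take a backup first):

sudo service postgresql stop
# data files -> SSD array (random I/O); trailing slashes copy contents
sudo rsync -a /var/lib/postgresql/9.2/main/ /ssd/main/
# WAL -> rotating array (sequential writes); the trailing slash makes
# rsync follow the existing pg_xlog symlink and copy the real directory
sudo rsync -a /var/lib/postgresql/9.2/main/pg_xlog/ /hdd/pg_xlog/
sudo rm /ssd/main/pg_xlog
sudo ln -s /hdd/pg_xlog /ssd/main/pg_xlog
# set data_directory = '/ssd/main' in postgresql.conf, then
sudo service postgresql start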


> ##### The stats 
> ###############################################################
> These are the typical observations/stats I see in one of these periods:
> 
> 1)
> At the top level, this is what I see in New Relic:
> https://rpm.newrelic.com/public/charts/6ewGRle6bmc
> 
> 2)
> When the database is loaded like this, I see a lot of queries taking up to 
> 1000 times as long as they would when the database is not loaded so heavily.
> 
> 3)
> sudo iostat -dmx (typical usage)
> Linux 3.2.0-33-generic (master-db)    12/10/2012      _x86_64_        (8 CPU)
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sda               0.00     6.52    3.59   26.61     0.22     0.74    65.49     0.01    0.40    0.77    0.35   0.14   0.43
> sdb               0.00     8.31    0.03   28.38     0.00     0.97    69.63     0.01    0.52    0.27    0.52   0.15   0.43
> sdc               1.71    46.01   34.83  116.62     0.56     4.06    62.47     1.90   12.57   21.81    9.81   1.89  28.66
> sdd               1.67    46.14   34.89  116.49     0.56     4.06    62.46     1.58   10.43   21.66    7.07   1.89  28.60
> md1               0.00     0.00    0.00    0.00     0.00     0.00     2.69     0.00    0.00    0.00    0.00   0.00   0.00
> md0               0.00     0.00    0.11    0.24     0.00     0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
> md2               0.00     0.00   72.99  161.95     1.11     4.06    45.10     0.00    0.00    0.00    0.00   0.00   0.00
> md3               0.00     0.00    0.05   32.32     0.00     0.74    47.00     0.00    0.00    0.00    0.00   0.00   0.00
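
A side note on reading this: invoked without an interval, iostat reports
averages since boot, which smooths out exactly the bursts being chased here.
Short samples are more telling:

sudo iostat -dmx 5   # 5-second samples; watch await and %util under load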
> 
> 4)
> sudo iotop -oa (running for about a minute or so)
> TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
>   292 be/4 root          0.00 B      0.00 B  0.00 % 99.33 % [md2_raid1]
>  2815 be/4 postgres     19.51 M     25.90 M  0.00 % 45.49 % postgres: autovacuum worker process   production
> 32553 be/4 postgres     45.74 M      9.38 M  0.00 % 37.89 % postgres: user production 192.168.0.3(58866) UPDATE
> 32570 be/4 postgres      6.91 M     35.02 M  0.00 % 16.71 % postgres: user production 192.168.0.3(35547) idle
> 32575 be/4 postgres      4.06 M     43.90 M  0.00 % 16.62 % postgres: user production 192.168.0.3(35561) SELECT
> 31673 be/4 postgres      4.14 M     52.16 M  0.00 % 16.24 % postgres: user production 192.168.0.3(39112) idle
> 32566 be/4 postgres      4.73 M     44.95 M  0.00 % 15.66 % postgres: user production 192.168.0.3(35531) idle
> 32568 be/4 postgres      4.50 M     33.84 M  0.00 % 14.62 % postgres: user production 192.168.0.3(35543) SELECT
> 32573 be/4 postgres      3.20 M     34.44 M  0.00 % 13.98 % postgres: user production 192.168.0.3(35559) idle
> 31590 be/4 postgres      3.23 M     29.72 M  0.00 % 13.90 % postgres: user production 192.168.0.3(50690) idle in transaction
> 32577 be/4 postgres      5.09 M     25.54 M  0.00 % 13.63 % postgres: user production 192.168.0.3(35563) idle
> 32565 be/4 postgres      2.06 M     35.93 M  0.00 % 13.41 % postgres: user production 192.168.0.3(35529) SELECT
> 32546 be/4 postgres      4.48 M     36.49 M  0.00 % 13.39 % postgres: user production 192.168.0.3(56927) UPDATE waiting
> 32569 be/4 postgres      3.50 M     26.75 M  0.00 % 12.82 % postgres: user production 192.168.0.3(35545) INSERT
> 31671 be/4 postgres      4.58 M     24.45 M  0.00 % 12.76 % postgres: user production 192.168.0.3(34841) idle in transaction
> 32551 be/4 postgres      3.26 M     31.77 M  0.00 % 12.06 % postgres: user production 192.168.0.3(58864) idle in transaction
> 32574 be/4 postgres      5.32 M     32.92 M  0.00 % 11.70 % postgres: user production 192.168.0.3(35560) idle
> 32572 be/4 postgres      3.00 M     32.66 M  0.00 % 11.66 % postgres: user production 192.168.0.3(35558) UPDATE
> 32560 be/4 postgres      5.12 M     25.89 M  0.00 % 11.52 % postgres: user production 192.168.0.3(33886) SELECT
> 32567 be/4 postgres      4.66 M     36.47 M  0.00 % 11.44 % postgres: user production 192.168.0.3(35534) SELECT
> 32571 be/4 postgres      2.86 M     31.27 M  0.00 % 11.31 % postgres: user production 192.168.0.3(35557) SELECT
> 32552 be/4 postgres      4.38 M     28.75 M  0.00 % 11.09 % postgres: user production 192.168.0.3(58865) idle in transaction
> 32554 be/4 postgres      3.69 M     30.21 M  0.00 % 10.90 % postgres: user production 192.168.0.3(58870) UPDATE
>   339 be/3 root          0.00 B      2.29 M  0.00 %  9.81 % [jbd2/md2-8]
> 32576 be/4 postgres      3.37 M     19.91 M  0.00 %  9.73 % postgres: user production 192.168.0.3(35562) idle
> 32555 be/4 postgres      3.09 M     31.96 M  0.00 %  9.02 % postgres: user production 192.168.0.3(58875) SELECT
> 27548 be/4 postgres      0.00 B     97.12 M  0.00 %  7.41 % postgres: writer process
> 31445 be/4 postgres    924.00 K     27.35 M  0.00 %  7.11 % postgres: user production 192.168.0.1(34536) idle
> 31443 be/4 postgres      2.54 M      4.56 M  0.00 %  6.32 % postgres: user production 192.168.0.1(34508) idle
> 31459 be/4 postgres   1480.00 K     21.36 M  0.00 %  5.63 % postgres: user production 192.168.0.1(34543) idle
>  1801 be/4 postgres   1896.00 K     10.89 M  0.00 %  5.57 % postgres: user production 192.168.0.3(34177) idle
> 32763 be/4 postgres   1696.00 K      6.95 M  0.00 %  5.33 % postgres: user production 192.168.0.3(57984) SELECT
>  1800 be/4 postgres      2.46 M      5.13 M  0.00 %  5.24 % postgres: user production 192.168.0.3(34175) SELECT
>  1803 be/4 postgres   1816.00 K      9.09 M  0.00 %  5.16 % postgres: user production 192.168.0.3(34206) idle
> 32578 be/4 postgres      2.57 M     11.62 M  0.00 %  5.06 % postgres: user production 192.168.0.3(35564) SELECT
> 31440 be/4 postgres      3.02 M      4.04 M  0.00 %  4.65 % postgres: user production 192.168.0.1(34463) idle
> 32605 be/4 postgres   1844.00 K     11.82 M  0.00 %  4.49 % postgres: user production 192.168.0.3(40399) idle
> 27547 be/4 postgres      0.00 B      0.00 B  0.00 %  3.93 % postgres: checkpointer process
> 31356 be/4 postgres   1368.00 K      3.27 M  0.00 %  3.93 % postgres: user production 192.168.0.1(34450) idle
> 32542 be/4 postgres   1180.00 K      6.05 M  0.00 %  3.90 % postgres: user production 192.168.0.3(56859) idle
> 32523 be/4 postgres   1088.00 K      4.33 M  0.00 %  3.59 % postgres: user production 192.168.0.3(48164) idle
> 32606 be/4 postgres   1964.00 K      6.94 M  0.00 %  3.51 % postgres: user production 192.168.0.3(40426) SELECT
> 31466 be/4 postgres   1596.00 K      3.11 M  0.00 %  3.47 % postgres: user production 192.168.0.1(34550) idle
> 32544 be/4 postgres   1184.00 K      4.25 M  0.00 %  3.38 % postgres: user production 192.168.0.3(56861) idle
> 31458 be/4 postgres   1088.00 K   1528.00 K  0.00 %  3.33 % postgres: user production 192.168.0.1(34541) idle
> 31444 be/4 postgres    884.00 K      4.23 M  0.00 %  3.27 % postgres: user production 192.168.0.1(34510) idle
> 32522 be/4 postgres    408.00 K      2.98 M  0.00 %  3.27 % postgres: user production 192.168.0.5(38361) idle
> 32762 be/4 postgres   1156.00 K      5.28 M  0.00 %  3.20 % postgres: user production 192.168.0.3(57962) idle
> 32582 be/4 postgres   1084.00 K      3.38 M  0.00 %  2.86 % postgres: user production 192.168.0.5(43104) idle
> 31353 be/4 postgres      2.04 M      3.02 M  0.00 %  2.82 % postgres: user production 192.168.0.1(34444) idle
> 31441 be/4 postgres    700.00 K      2.68 M  0.00 %  2.64 % postgres: user production 192.168.0.1(34465) idle
> 31462 be/4 postgres    980.00 K      3.50 M  0.00 %  2.57 % postgres: user production 192.168.0.1(34547) idle
> 32709 be/4 postgres    428.00 K      3.23 M  0.00 %  2.56 % postgres: user production 192.168.0.5(34323) idle
>   685 be/4 postgres    748.00 K      3.59 M  0.00 %  2.41 % postgres: user production 192.168.0.3(34911) idle
>   683 be/4 postgres    728.00 K      3.19 M  0.00 %  2.38 % postgres: user production 192.168.0.3(34868) idle
> 32765 be/4 postgres    464.00 K      3.76 M  0.00 %  2.21 % postgres: user production 192.168.0.3(58074) idle
> 32760 be/4 postgres    808.00 K      6.18 M  0.00 %  2.16 % postgres: user production 192.168.0.3(57958) idle
>  1912 be/4 postgres    372.00 K      3.03 M  0.00 %  2.16 % postgres: user production 192.168.0.5(33743) idle
> 31446 be/4 postgres   1004.00 K      2.09 M  0.00 %  2.16 % postgres: user production 192.168.0.1(34539) idle
> 31460 be/4 postgres    584.00 K      2.74 M  0.00 %  2.10 % postgres: user production 192.168.0.1(34545) idle
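
On the question (below) of what keeps writing to md2: besides the autovacuum
worker at the top of this list, the writer process, the checkpointer, and
the backends themselves all flush dirty heap and index pages, and those data
files live on md2. The standard pg_stat_bgwriter view breaks those buffer
writes down:

psql -d production -c 'SELECT checkpoints_timed, checkpoints_req, buffers_checkpoint, buffers_clean, buffers_backend FROM pg_stat_bgwriter;'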
> 
> 5) vmstat 1
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  1  1 573424 321080  27124 28504352    0    0   143   618    0    4  2  0 91  7
>  0  1 573424 320764  27124 28504496    0    0   104 15654 3788 4961  1  0 85 14
>  0  1 573424 320684  27124 28504616    0    0   276 12736 4099 5374  0  1 84 15
>  0  1 573424 319672  27124 28504900    0    0    80  7746 3624 4949  2  0 82 16
>  0  1 573424 319180  27124 28504972    0    0    36 12489 3653 4761  2  0 86 12
>  0  1 573424 318184  27132 28505000    0    0     8 10482 3413 4898  0  0 87 13
>  0  1 573424 318424  27132 28505016    0    0     0  9564 2746 4290  0  0 87 13
>  0  1 573424 318308  27168 28505016    0    0    36 10562 1895 2149  0  0 87 12
>  0  3 573424 318208  27168 28505020    0    0    84 18529 3035 3265  1  0 85 14
>  0  1 573424 318732  27176 28505080    0    0    84 14574 2986 3231  0  0 84 16
>  0  2 573424 317588  27176 28505184    0    0     4  6681 1991 2207  2  1 86 12
>  0  1 573424 316852  27176 28505260    0    0    76  7670 2910 3996  2  1 85 13
>  0  1 573424 316632  27184 28505256    0    0     0  7186 2661 3740  0  0 87 12
>  0  1 573424 316720  27188 28505260    0    0     0  2590 1731 2474  0  0 88 12
>  0  1 573424 314252  27192 28505696    0    0   460 11612 1757 2431  0  0 82 18
>  0  2 573424 313504  27192 28505724    0    0     0 19656 1775 2099  0  0 83 17
>  0  3 573424 313300  27196 28505780    0    0   188  6237 2746 3193  2  0 80 17
>  0  2 573424 312736  27200 28506348    0    0   804 18466 5014 6430  2  1 75 23
>  2 35 573424 307564  27200 28509920    0    0  3912 16280 14377 15470 14  3 28 56
>  0  5 573424 282848  27208 28533964    0    0  7484 27580 22017 25938 17  3 17 63
>  1  5 573424 221100  27208 28563360    0    0  2852  3120 19639 28664 12  5 52 31
>  0  4 573428 229912  26704 28519184    0    4  1208  5890 13976 20851 13  3 56 28
>  0  2 573448 234680  26672 28513632    0   20     0 17204 1694 2636  0  0 71 28
>  3  7 573452 220836  26644 28525548    0    4  1540 36370 27928 36551 17  5 50 27
>  1  3 573488 234380  26556 28517416    0   36   584 19066 8275 9467  3  2 60 36
>  0  1 573488 234496  26556 28517852    0    0    56 47429 3290 4310  0  0 79 20
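
The periodic spikes in the wa column look to me like checkpoint activity -
with checkpoint_segments = 100, a 10min timeout, and shared_buffers = 4GB
there can be a lot of dirty data to flush at once. A cheap way to test that
guess is checkpoint logging, then comparing timestamps against the I/O-wait
spikes:

# in postgresql.conf
log_checkpoints = on
# then reload: sudo service postgresql reload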
> 
> 6) sudo lsof - a hell of a lot of output; I can post it if anyone is 
> interested :-)
> 
> #### Notes and thoughts  
> ##############################################################################
> 
> As you can see, even though I have moved the pg_xlog folder to the SSD array 
> (md3), by far the largest amount of writes still goes to the regular HDDs 
> (md2), which puzzles me - what can that be?
> From stat 3) (the iostat) I notice that the SSDs don't seem to be anywhere 
> near fully utilized - maybe something other than just pg_xlog could be moved 
> here?
> I have no idea whether the amount of reads/writes is within what my kind of 
> hardware should handle comfortably, or far beyond it.
> In stat 4) (the iotop) it says that the RAID array (md2) is the most 
> "waiting" part - does that look like a root cause, or more like a symptom of 
> some other bottleneck?
> 
> Thanks for taking the time to look at my data! :-)
