Please help with advice!

Server
HP ProLiant BL460c G1

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    1
Core(s) per socket:    4
CPU socket(s):         2
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 23
Stepping:              6
CPU MHz:               3000.105
BogoMIPS:              6000.04
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              6144K
NUMA node0 CPU(s):     0-7

32GB RAM
[root@db3 ~]# numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 32765 MB
node 0 free: 317 MB
node distances:
node   0
  0:  10


RAID1 2x146GB 10k rpm

CentOS release 6.3 (Final)
Linux 2.6.32-279.11.1.el6.x86_64 #1 SMP x86_64 GNU/Linux


kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
vm.swappiness = 30
vm.dirty_background_bytes = 67108864
vm.dirty_bytes = 536870912


PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6
20120305 (Red Hat 4.4.6-4), 64-bit

listen_addresses = '*'
port = 5433
max_connections = 350
shared_buffers = 8GB
temp_buffers = 64MB
max_prepared_transactions = 350
work_mem = 256MB
maintenance_work_mem = 1GB
max_stack_depth = 4MB
max_files_per_process = 5000
effective_io_concurrency = 2
wal_level = hot_standby
synchronous_commit = off
checkpoint_segments = 64
checkpoint_timeout = 15min
checkpoint_completion_target = 0.75
max_wal_senders = 4
wal_sender_delay = 100ms
wal_keep_segments = 128
random_page_cost = 3.0
effective_cache_size = 18GB
autovacuum = on
autovacuum_max_workers = 5
autovacuum_vacuum_threshold = 900
autovacuum_analyze_threshold = 350
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_scale_factor = 0.05
log_min_duration_statement = 500
deadlock_timeout = 1s


DB size is about 20GB. There is no high write activity on DB. But
periodically in postgresql log i see for example: "select 1" duration is
about 500-1000 ms.

In this period of time response time from db terribly. This period of time not
bound with high traffic. It is not other app on the server. There is not
specific cron job on server.

Our app written on java and use jdbc to connect to DB and internal pooling.
There is about 100 connection to DB. This is sar output:

12:00:01 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s
pgscand/s pgsteal/s    %vmeff
09:30:01 PM     73.17    302.72 134790.16      0.00  46809.73
0.00      0.00      0.00      0.00
09:35:01 PM     63.42    655.80 131740.74      0.00  46182.74
0.00      0.00      0.00      0.00
09:40:01 PM     76.87    400.62 122375.34      0.00  42096.27
0.00      0.00      0.00      0.00
09:45:01 PM     58.49    198.33 121922.86      0.00  42765.27
0.00      0.00      0.00      0.00
09:50:01 PM     52.21    485.45 136775.65      0.15  49098.65
0.00      0.00      0.00      0.00
09:55:01 PM     49.68    476.75 130159.24      0.00  45192.54
0.00      0.00      0.00      0.00
10:00:01 PM     41.35    295.34 118655.80      0.00  40786.52
0.00      0.00      0.00      0.00
10:05:01 PM     60.84    593.85 129890.83      0.00  44170.92
0.00      0.00      0.00      0.00
10:10:01 PM     52.08    471.36 132773.63      0.00  46019.13
0.00      2.41      2.41    100.00
10:15:01 PM     73.93    196.50 129384.21      0.33  45255.76
65.92      1.19     66.87     99.64
10:20:02 PM     70.35    473.16 121940.38      0.11  44061.52     81.95
37.79    119.42     99.73
10:25:01 PM     57.84    471.69 130583.33      0.01  46093.33
0.00      0.00      0.00      0.00
10:30:01 PM     52.91    321.62 119264.34      0.01  41748.19
0.00      0.00      0.00      0.00
10:35:01 PM     47.13    451.78 114625.62      0.02  40600.98
0.00      0.00      0.00      0.00
10:40:01 PM     48.96    472.41 102352.79      0.00  35402.17
0.00      0.00      0.00      0.00
10:45:01 PM     70.07    321.33 121423.02      0.00  43052.04
0.00      0.00      0.00      0.00
10:50:01 PM     46.78    479.95 128938.09      0.02  37864.07    116.64
48.97    165.07     99.67
10:55:02 PM    104.84    453.55 109189.98      0.00  37583.50
0.00      0.00      0.00      0.00
11:00:01 PM     46.23    248.75 107313.26      0.00  37278.10
0.00      0.00      0.00      0.00
11:05:01 PM     44.28    446.41 115598.61      0.01  40070.61
0.00      0.00      0.00      0.00
11:10:01 PM     38.86    457.32 100240.71      0.00  34407.29
0.00      0.00      0.00      0.00
11:15:01 PM     48.23    275.60 104780.84      0.00  36183.84
0.00      0.00      0.00      0.00
11:20:01 PM     92.74    432.49 114698.74      0.01  40413.14
0.00      0.00      0.00      0.00
11:25:01 PM     42.76    428.50  87769.28      0.00  29379.87
0.00      0.00      0.00      0.00
11:30:01 PM     36.83    260.34  85072.46      0.00  28234.50
0.00      0.00      0.00      0.00
11:35:01 PM     62.52    481.56  93150.67      0.00  31137.13
0.00      0.00      0.00      0.00
11:40:01 PM     43.50    459.10  90407.34      0.00  30241.70
0.00      0.00      0.00      0.00

12:00:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit
%commit
09:30:01 PM    531792  32345400     98.38    475504  29583340  10211064
27.62
09:35:01 PM    512096  32365096     98.44    475896  29608660  10200916
27.59
09:40:01 PM    455584  32421608     98.61    476276  29638952  10211652
27.62
09:45:01 PM    425744  32451448     98.71    476604  29662384  10206044
27.60
09:50:01 PM    380960  32496232     98.84    477004  29684296  10243704
27.71
09:55:01 PM    385644  32491548     98.83    477312  29706940  10204776
27.60
10:00:01 PM    348604  32528588     98.94    477672  29725476  10228984
27.67
10:05:01 PM    279216  32597976     99.15    478104  29751016  10281748
27.81
10:10:01 PM    255168  32622024     99.22    478220  29769924  10247404
27.72
10:15:01 PM    321188  32556004     99.02    475124  29721912  10234500
27.68
10:20:02 PM    441660  32435532     98.66    472336  29610476  10246288
27.71
10:25:01 PM    440636  32436556     98.66    472636  29634960  10219940
27.64
10:30:01 PM    469872  32407320     98.57    473016  29651476  10208520
27.61
10:35:01 PM    414540  32462652     98.74    473424  29672728  10223964
27.65
10:40:01 PM    354632  32522560     98.92    473772  29693016  10247752
27.72
10:45:01 PM    333708  32543484     98.98    474092  29720256  10227204
27.66
10:50:01 PM    528004  32349188     98.39    469396  29549832  10219536
27.64
10:55:02 PM    499068  32378124     98.48    469692  29587140  10204836
27.60
11:00:01 PM    462980  32414212     98.59    470032  29606764  10235820
27.68
11:05:01 PM    449540  32427652     98.63    470368  29626136  10209788
27.61
11:10:01 PM    419984  32457208     98.72    470772  29644248  10214480
27.63
11:15:01 PM    429900  32447292     98.69    471104  29664292  10202344
27.59
11:20:01 PM    394852  32482340     98.80    471528  29698052  10207604
27.61
11:25:01 PM    345328  32531864     98.95    471904  29717264  10215632
27.63
11:30:01 PM    368224  32508968     98.88    472236  29733544  10206468
27.61
11:35:01 PM    321800  32555392     99.02    472528  29758548  10211820
27.62
11:40:01 PM    282520  32594672     99.14    472860  29776952  10243516
27.71

12:00:01 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz
await     svctm     %util
09:30:01 PM  dev253-5     66.29    146.33    483.33      9.50      6.27
94.53      2.08     13.78
09:35:01 PM  dev253-5    154.80    126.85   1192.96      8.53     28.57
184.59      1.45     22.43
09:40:01 PM  dev253-5     92.21    153.75    686.75      9.11     11.53
125.00      1.87     17.21
09:45:01 PM  dev253-5     39.66    116.99    279.32      9.99      0.42
10.66      2.61     10.36
09:50:01 PM  dev253-5    106.73     95.58    820.70      8.58     16.77
157.12      1.68     17.88
09:55:01 PM  dev253-5    107.90     99.36    831.46      8.63     16.05
148.76      1.71     18.42
10:00:01 PM  dev253-5     62.48     82.70    471.28      8.87      5.91
94.52      2.10     13.11
10:05:01 PM  dev253-5    137.84    121.69   1064.03      8.60     24.48
177.31      1.56     21.52
10:10:01 PM  dev253-5    107.93    104.16    827.83      8.64     16.69
155.04      1.68     18.11
10:15:01 PM  dev253-5     40.55    126.12    277.57      9.96      0.41
10.13      2.57     10.42
10:20:02 PM  dev253-5    104.33    136.77    793.49      8.92     16.97
162.69      1.76     18.35
10:25:01 PM  dev253-5    108.04    115.36    825.26      8.71     16.68
154.36      1.76     19.05
10:30:01 PM  dev253-5     69.72    105.66    523.05      9.02      7.45
106.92      1.90     13.25
10:35:01 PM  dev253-5    101.58     91.59    781.85      8.60     15.00
147.68      1.67     16.97
10:40:01 PM  dev253-5    107.50     97.91    827.17      8.61     17.68
164.49      1.77     19.06
10:45:01 PM  dev253-5     69.98    140.13    519.57      9.43      7.09
101.25      1.96     13.72
10:50:01 PM  dev253-5    104.30     83.31    806.12      8.53     16.18
155.10      1.65     17.16
10:55:02 PM  dev253-5    106.86    209.65    790.27      9.36     15.59
145.08      1.74     18.60
11:00:01 PM  dev253-5     50.42     92.08    371.52      9.19      3.05
62.16      2.28     11.52
11:05:01 PM  dev253-5    101.06     88.31    776.57      8.56     15.12
149.58      1.67     16.90
11:10:01 PM  dev253-5    103.08     77.73    798.23      8.50     17.14
166.25      1.74     17.90
11:15:01 PM  dev253-5     57.74     96.45    428.62      9.09      5.23
90.52      2.13     12.32
11:20:01 PM  dev253-5     97.73    185.18    727.38      9.34     14.64
149.84      1.94     18.92
11:25:01 PM  dev253-5     95.03     85.52    730.31      8.58     14.42
151.79      1.79     16.97
11:30:01 PM  dev253-5     53.76     73.65    404.47      8.89      3.94
73.25      2.17     11.64
11:35:01 PM  dev253-5    110.37    125.05    842.17      8.76     16.96
153.63      1.66     18.30
11:40:01 PM  dev253-5    103.93     87.00    801.59      8.55     16.01
154.00      1.73     18.00

As you can see there is no high io activity in this period of time but db
is frozen. My opinion that i have incorrect kernel setting and/or i have a
mistake in postgresql.conf. Because there is not high activity on db. load
avg is about 1. When there is high traffic is about 1.15. This is from
nagios monitoring system.

But sometimes load is about 4 and this time matches with sar %vmeff = 100%
and database response time increase.

-- 
С уважением Селявка Евгений

Reply via email to