Hello desmodemone, I have looked again and again through my sar statistics,
and I don't think my DB is swapping at freeze time. For example:

sar -B
12:00:02 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
09:40:01 PM     66.13    352.43 195070.33      0.00  70627.21      0.00      0.00      0.00      0.00
09:45:01 PM     54.55    526.87 190893.02      0.00  67850.63     25.76      2.40     28.16    100.00
09:50:01 PM     74.97    509.70 200564.75      0.00  71807.80      0.00      0.00      0.00      0.00
09:55:01 PM     81.54    443.57 186335.43      0.05  68486.83    127.33    160.26    287.35     99.92
10:00:01 PM     62.03    528.46 169701.41      0.00  60487.37      0.00     15.62     15.62    100.00
10:05:01 PM     64.61    504.76 178725.60      0.00  66251.26      0.00     15.80     15.80    100.00
10:10:01 PM     80.06    336.47 172819.14      0.00  62379.45      0.00      0.00      0.00      0.00
10:15:01 PM     59.69    512.85 180228.56      0.00  64091.90      0.00      0.00      0.00      0.00

sar -S
12:00:02 AM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
09:40:01 PM   4095420       572      0.01       252     44.06
09:45:01 PM   4095420       572      0.01       252     44.06
09:50:01 PM   4095420       572      0.01       252     44.06
09:55:01 PM   4095420       572      0.01       252     44.06
10:00:01 PM   4095420       572      0.01       252     44.06
10:05:01 PM   4095420       572      0.01       252     44.06
10:10:01 PM   4095420       572      0.01       252     44.06
10:15:01 PM   4095420       572      0.01       252     44.06

During this time, as you can see, swap usage did not change at all. And this
is a dedicated server for PostgreSQL; there are no other applications on it
except pacemaker+corosync for the HA cluster. Maybe I am reading my sar
statistics incorrectly?
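
For what it's worth, sar can also show swap activity directly; sar -S only
shows how much swap space is occupied, not whether pages are actually moving
in and out. Two standard commands to double-check during a freeze (nothing
here is specific to my setup):

sar -W        # pswpin/s and pswpout/s are the actual swap-in/out page rates
vmstat 1      # live, one sample per second; watch the si/so columns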

I set shared_buffers to 1/4 of the available RAM: I have 32GB of RAM, so
shared_buffers is 8GB (work_mem is 256MB, as in the config below).
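
One thing desmodemone's question below made me check is the worst case if
many backends each grab a work_mem allocation at the same time (a sort or
hash can each use up to work_mem, and a complex plan can use several). Rough
shell arithmetic with the values from my postgresql.conf, in MB:

echo $(( 350 * 256 ))           # max_connections * work_mem = 89600 MB worst case
echo $(( 100 * 256 + 8192 ))    # ~100 real connections + shared_buffers = 33792 MB

So even with only the ~100 JDBC connections, one allocation per backend plus
shared_buffers can already exceed the 32GB of RAM.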

Now I have also set:

vm.dirty_bytes = 67108864 (64MB, equal to my Smart Array E200i cache size)
vm.dirty_background_bytes = 16777216 (16MB, 1/4 of vm.dirty_bytes)
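
For anyone repeating this, applying and persisting these values is plain
sysctl usage (the file path is the standard one on CentOS 6):

sysctl -w vm.dirty_bytes=67108864
sysctl -w vm.dirty_background_bytes=16777216
echo "vm.dirty_bytes = 67108864" >> /etc/sysctl.conf
echo "vm.dirty_background_bytes = 16777216" >> /etc/sysctl.conf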

The next step is to try to find correct values for the background writer
settings (a sketch of what I plan to try follows this list):
bgwriter_delay
bgwriter_lru_maxpages
bgwriter_lru_multiplier
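
As a starting point only, not tested values: the 9.1 defaults are
bgwriter_delay = 200ms, bgwriter_lru_maxpages = 100 and
bgwriter_lru_multiplier = 2.0, and all three take effect on reload. Something
like this would make the background writer run more often and write more
pages per round:

# in postgresql.conf (the data directory path is just an example):
# bgwriter_delay = 100ms
# bgwriter_lru_maxpages = 200
# bgwriter_lru_multiplier = 3.0
pg_ctl reload -D /var/lib/pgsql/9.1/data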


And one more fact: if I clean up the FS cache with

sync && echo 1 > /proc/sys/vm/drop_caches

the OS releases about 10-12GB of memory, and the freeze time comes back once
the FS cache is full again. For example, for one or two days after the drop
there are no freezes on the DB.
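
To correlate the freezes with the cache filling up again, I can watch the
cached figure over time; both of these are standard tools (the awk variant
assumes gawk, which CentOS ships):

sar -r 60     # kbcached column, one sample per minute
while true; do awk '/^Cached:/ {print strftime("%T"), $2, "kB"}' /proc/meminfo; sleep 60; done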


2013/11/4 desmodemone <desmodem...@gmail.com>

> Hello,
>              I saw your request on the performance mailing list. I think
> your server is swapping, and because your swap is on the same RAID disk as
> everything else (/, the database datastore, etc.) you encounter a freeze
> of the system.
>
> I think you have to analyze why you are swapping. Is there only
> postgresql inside? Is it possible you are using too much work_mem memory?
>
>
> Have a nice day
>
>
> 2013/11/2 Евгений Селявка <evg.selya...@gmail.com>
>
>> Please help with advice!
>>
>> Server
>> HP ProLiant BL460c G1
>>
>> Architecture:          x86_64
>> CPU op-mode(s):        32-bit, 64-bit
>> Byte Order:            Little Endian
>> CPU(s):                8
>> On-line CPU(s) list:   0-7
>> Thread(s) per core:    1
>> Core(s) per socket:    4
>> CPU socket(s):         2
>> NUMA node(s):          1
>> Vendor ID:             GenuineIntel
>> CPU family:            6
>> Model:                 23
>> Stepping:              6
>> CPU MHz:               3000.105
>> BogoMIPS:              6000.04
>> Virtualization:        VT-x
>> L1d cache:             32K
>> L1i cache:             32K
>> L2 cache:              6144K
>> NUMA node0 CPU(s):     0-7
>>
>> 32GB RAM
>> [root@db3 ~]# numactl --hardware
>> available: 1 nodes (0)
>> node 0 cpus: 0 1 2 3 4 5 6 7
>> node 0 size: 32765 MB
>> node 0 free: 317 MB
>> node distances:
>> node   0
>>   0:  10
>>
>>
>> RAID1 2x146GB 10k rpm
>>
>> CentOS release 6.3 (Final)
>> Linux 2.6.32-279.11.1.el6.x86_64 #1 SMP x86_64 GNU/Linux
>>
>>
>> kernel.msgmnb = 65536
>> kernel.msgmax = 65536
>> kernel.shmmax = 68719476736
>> kernel.shmall = 4294967296
>> vm.swappiness = 30
>> vm.dirty_background_bytes = 67108864
>> vm.dirty_bytes = 536870912
>>
>>
>> PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6
>> 20120305 (Red Hat 4.4.6-4), 64-bit
>>
>> listen_addresses = '*'
>> port = 5433
>> max_connections = 350
>> shared_buffers = 8GB
>> temp_buffers = 64MB
>> max_prepared_transactions = 350
>> work_mem = 256MB
>> maintenance_work_mem = 1GB
>> max_stack_depth = 4MB
>> max_files_per_process = 5000
>> effective_io_concurrency = 2
>> wal_level = hot_standby
>> synchronous_commit = off
>> checkpoint_segments = 64
>> checkpoint_timeout = 15min
>> checkpoint_completion_target = 0.75
>> max_wal_senders = 4
>> wal_sender_delay = 100ms
>> wal_keep_segments = 128
>> random_page_cost = 3.0
>> effective_cache_size = 18GB
>> autovacuum = on
>> autovacuum_max_workers = 5
>> autovacuum_vacuum_threshold = 900
>> autovacuum_analyze_threshold = 350
>> autovacuum_vacuum_scale_factor = 0.1
>> autovacuum_analyze_scale_factor = 0.05
>> log_min_duration_statement = 500
>> deadlock_timeout = 1s
>>
>>
>> DB size is about 20GB. There is no high write activity on the DB. But
>> periodically in the postgresql log I see, for example, a "select 1" with
>> a duration of about 500-1000 ms.
>>
>> In these periods the response time from the DB is terrible. These periods
>> are not bound to high traffic, there are no other apps on the server, and
>> there is no specific cron job on the server.
>>
>> Our app is written in Java and uses JDBC to connect to the DB, with
>> internal pooling. There are about 100 connections to the DB. This is the
>> sar output:
>>
>> 12:00:01 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
>> 09:30:01 PM     73.17    302.72 134790.16      0.00  46809.73      0.00      0.00      0.00      0.00
>> 09:35:01 PM     63.42    655.80 131740.74      0.00  46182.74      0.00      0.00      0.00      0.00
>> 09:40:01 PM     76.87    400.62 122375.34      0.00  42096.27      0.00      0.00      0.00      0.00
>> 09:45:01 PM     58.49    198.33 121922.86      0.00  42765.27      0.00      0.00      0.00      0.00
>> 09:50:01 PM     52.21    485.45 136775.65      0.15  49098.65      0.00      0.00      0.00      0.00
>> 09:55:01 PM     49.68    476.75 130159.24      0.00  45192.54      0.00      0.00      0.00      0.00
>> 10:00:01 PM     41.35    295.34 118655.80      0.00  40786.52      0.00      0.00      0.00      0.00
>> 10:05:01 PM     60.84    593.85 129890.83      0.00  44170.92      0.00      0.00      0.00      0.00
>> 10:10:01 PM     52.08    471.36 132773.63      0.00  46019.13      0.00      2.41      2.41    100.00
>> 10:15:01 PM     73.93    196.50 129384.21      0.33  45255.76     65.92      1.19     66.87     99.64
>> 10:20:02 PM     70.35    473.16 121940.38      0.11  44061.52     81.95     37.79    119.42     99.73
>> 10:25:01 PM     57.84    471.69 130583.33      0.01  46093.33      0.00      0.00      0.00      0.00
>> 10:30:01 PM     52.91    321.62 119264.34      0.01  41748.19      0.00      0.00      0.00      0.00
>> 10:35:01 PM     47.13    451.78 114625.62      0.02  40600.98      0.00      0.00      0.00      0.00
>> 10:40:01 PM     48.96    472.41 102352.79      0.00  35402.17      0.00      0.00      0.00      0.00
>> 10:45:01 PM     70.07    321.33 121423.02      0.00  43052.04      0.00      0.00      0.00      0.00
>> 10:50:01 PM     46.78    479.95 128938.09      0.02  37864.07    116.64     48.97    165.07     99.67
>> 10:55:02 PM    104.84    453.55 109189.98      0.00  37583.50      0.00      0.00      0.00      0.00
>> 11:00:01 PM     46.23    248.75 107313.26      0.00  37278.10      0.00      0.00      0.00      0.00
>> 11:05:01 PM     44.28    446.41 115598.61      0.01  40070.61      0.00      0.00      0.00      0.00
>> 11:10:01 PM     38.86    457.32 100240.71      0.00  34407.29      0.00      0.00      0.00      0.00
>> 11:15:01 PM     48.23    275.60 104780.84      0.00  36183.84      0.00      0.00      0.00      0.00
>> 11:20:01 PM     92.74    432.49 114698.74      0.01  40413.14      0.00      0.00      0.00      0.00
>> 11:25:01 PM     42.76    428.50  87769.28      0.00  29379.87      0.00      0.00      0.00      0.00
>> 11:30:01 PM     36.83    260.34  85072.46      0.00  28234.50      0.00      0.00      0.00      0.00
>> 11:35:01 PM     62.52    481.56  93150.67      0.00  31137.13      0.00      0.00      0.00      0.00
>> 11:40:01 PM     43.50    459.10  90407.34      0.00  30241.70      0.00      0.00      0.00      0.00
>>
>> 12:00:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
>> 09:30:01 PM    531792  32345400     98.38    475504  29583340  10211064     27.62
>> 09:35:01 PM    512096  32365096     98.44    475896  29608660  10200916     27.59
>> 09:40:01 PM    455584  32421608     98.61    476276  29638952  10211652     27.62
>> 09:45:01 PM    425744  32451448     98.71    476604  29662384  10206044     27.60
>> 09:50:01 PM    380960  32496232     98.84    477004  29684296  10243704     27.71
>> 09:55:01 PM    385644  32491548     98.83    477312  29706940  10204776     27.60
>> 10:00:01 PM    348604  32528588     98.94    477672  29725476  10228984     27.67
>> 10:05:01 PM    279216  32597976     99.15    478104  29751016  10281748     27.81
>> 10:10:01 PM    255168  32622024     99.22    478220  29769924  10247404     27.72
>> 10:15:01 PM    321188  32556004     99.02    475124  29721912  10234500     27.68
>> 10:20:02 PM    441660  32435532     98.66    472336  29610476  10246288     27.71
>> 10:25:01 PM    440636  32436556     98.66    472636  29634960  10219940     27.64
>> 10:30:01 PM    469872  32407320     98.57    473016  29651476  10208520     27.61
>> 10:35:01 PM    414540  32462652     98.74    473424  29672728  10223964     27.65
>> 10:40:01 PM    354632  32522560     98.92    473772  29693016  10247752     27.72
>> 10:45:01 PM    333708  32543484     98.98    474092  29720256  10227204     27.66
>> 10:50:01 PM    528004  32349188     98.39    469396  29549832  10219536     27.64
>> 10:55:02 PM    499068  32378124     98.48    469692  29587140  10204836     27.60
>> 11:00:01 PM    462980  32414212     98.59    470032  29606764  10235820     27.68
>> 11:05:01 PM    449540  32427652     98.63    470368  29626136  10209788     27.61
>> 11:10:01 PM    419984  32457208     98.72    470772  29644248  10214480     27.63
>> 11:15:01 PM    429900  32447292     98.69    471104  29664292  10202344     27.59
>> 11:20:01 PM    394852  32482340     98.80    471528  29698052  10207604     27.61
>> 11:25:01 PM    345328  32531864     98.95    471904  29717264  10215632     27.63
>> 11:30:01 PM    368224  32508968     98.88    472236  29733544  10206468     27.61
>> 11:35:01 PM    321800  32555392     99.02    472528  29758548  10211820     27.62
>> 11:40:01 PM    282520  32594672     99.14    472860  29776952  10243516     27.71
>>
>> 12:00:01 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
>> 09:30:01 PM  dev253-5     66.29    146.33    483.33      9.50      6.27     94.53      2.08     13.78
>> 09:35:01 PM  dev253-5    154.80    126.85   1192.96      8.53     28.57    184.59      1.45     22.43
>> 09:40:01 PM  dev253-5     92.21    153.75    686.75      9.11     11.53    125.00      1.87     17.21
>> 09:45:01 PM  dev253-5     39.66    116.99    279.32      9.99      0.42     10.66      2.61     10.36
>> 09:50:01 PM  dev253-5    106.73     95.58    820.70      8.58     16.77    157.12      1.68     17.88
>> 09:55:01 PM  dev253-5    107.90     99.36    831.46      8.63     16.05    148.76      1.71     18.42
>> 10:00:01 PM  dev253-5     62.48     82.70    471.28      8.87      5.91     94.52      2.10     13.11
>> 10:05:01 PM  dev253-5    137.84    121.69   1064.03      8.60     24.48    177.31      1.56     21.52
>> 10:10:01 PM  dev253-5    107.93    104.16    827.83      8.64     16.69    155.04      1.68     18.11
>> 10:15:01 PM  dev253-5     40.55    126.12    277.57      9.96      0.41     10.13      2.57     10.42
>> 10:20:02 PM  dev253-5    104.33    136.77    793.49      8.92     16.97    162.69      1.76     18.35
>> 10:25:01 PM  dev253-5    108.04    115.36    825.26      8.71     16.68    154.36      1.76     19.05
>> 10:30:01 PM  dev253-5     69.72    105.66    523.05      9.02      7.45    106.92      1.90     13.25
>> 10:35:01 PM  dev253-5    101.58     91.59    781.85      8.60     15.00    147.68      1.67     16.97
>> 10:40:01 PM  dev253-5    107.50     97.91    827.17      8.61     17.68    164.49      1.77     19.06
>> 10:45:01 PM  dev253-5     69.98    140.13    519.57      9.43      7.09    101.25      1.96     13.72
>> 10:50:01 PM  dev253-5    104.30     83.31    806.12      8.53     16.18    155.10      1.65     17.16
>> 10:55:02 PM  dev253-5    106.86    209.65    790.27      9.36     15.59    145.08      1.74     18.60
>> 11:00:01 PM  dev253-5     50.42     92.08    371.52      9.19      3.05     62.16      2.28     11.52
>> 11:05:01 PM  dev253-5    101.06     88.31    776.57      8.56     15.12    149.58      1.67     16.90
>> 11:10:01 PM  dev253-5    103.08     77.73    798.23      8.50     17.14    166.25      1.74     17.90
>> 11:15:01 PM  dev253-5     57.74     96.45    428.62      9.09      5.23     90.52      2.13     12.32
>> 11:20:01 PM  dev253-5     97.73    185.18    727.38      9.34     14.64    149.84      1.94     18.92
>> 11:25:01 PM  dev253-5     95.03     85.52    730.31      8.58     14.42    151.79      1.79     16.97
>> 11:30:01 PM  dev253-5     53.76     73.65    404.47      8.89      3.94     73.25      2.17     11.64
>> 11:35:01 PM  dev253-5    110.37    125.05    842.17      8.76     16.96    153.63      1.66     18.30
>> 11:40:01 PM  dev253-5    103.93     87.00    801.59      8.55     16.01    154.00      1.73     18.00
>>
>> As you can see, there is no high IO activity in this period of time, but
>> the DB is frozen. My opinion is that I have incorrect kernel settings
>> and/or a mistake in postgresql.conf, because there is no high activity on
>> the DB. Load average is about 1; when there is high traffic it is about
>> 1.15 (this is from our Nagios monitoring system).
>>
>> But sometimes load is about 4, and those times match sar %vmeff = 100%
>> and an increase in database response time.
>>
>> --
>> Best regards, Селявка Евгений
>>
>
>


-- 
Best regards, Селявка Евгений
