Hi,

There's a lot here, so feel free to skip to the middle, starting from my WAL settings.

I'm currently investigating performance on a large database which stores email designated as SPAM, for the perusal of customers who wish to check it. The system incorporates a number of subprocesses - several delivery daemons, an expiry daemon and a UI which performs large SELECTs. A considerable amount of UPDATE, SELECT and DELETE activity is performed continually.

Starting with a stock pg config, I understand well the importance of increased shared memory, effective cache size and a low random_page_cost, as detailed in http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html. After some system analysis with vmstat and sar we've been able to determine that the main problem is that we're IO bound, and IMO this is due to lots of updates causing high drive contention - the array is a RAID0 mirror and the dataset was originally 79GB. A lot of SPAM is being sent from our mail scanners, and coupled with the UI load this is creating an increasingly lagging system.

Typically all our db servers have this sort of enhancement - 1GB RAM, SMP boxen with SCSI-160 disks:
effective_cache_size = 95694
random_page_cost = 0.5
sort_mem=65536
max_connections = 128
shared_buffers = 15732
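
(For reference, both of those are counted in 8KB pages - assuming the stock block size - so they work out roughly as effective_cache_size = 95694 * 8KB ~= 747MB and shared_buffers = 15732 * 8KB ~= 123MB.)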


My focus today has been on WAL - I've not looked at WAL before. I've increased the settings as follows:

wal_buffers = 64 # need to determine WAL usage
wal_files = 64 # range 0-64
wal_sync_method = fsync   # the default varies across platforms:
wal_debug = 0             # range 0-16

# hopefully this should mean fewer LogFlushes per LogInsert - using more WAL though.
commit_delay = 10000 # range 0-100000
commit_siblings = 2 # range 1-1000
checkpoint_segments = 16 # in logfile segments (16MB each), min 1
checkpoint_timeout = 300 # in seconds, range 30-3600
fsync = true


Great improvements have been seen. A vacuumdb -f -a -z went from processing one table in 10 minutes to ten tables in one minute. :) Under the old settings I actually stopped it after 80 tables (48 hours runtime) because the projected end time would have been the following week. Once I restarted the postmaster with the above WAL settings, vacuumdb -f -a -z completed all 650 tables by the following day.
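
For anyone wanting to reproduce the comparison, timing a single-table run before and after a settings change works well enough - the database and table names below are made up for illustration:

$ time vacuumdb -f -z -v -d spamdb -t mail_example_com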

My thinking is therefore to reduce disk contention as far as possible within the current hardware limitations. The aim is for a committing backend that sees at least commit_siblings other active transactions - hence keeping siblings low at 2 - to wait commit_delay microseconds (10ms here) before syncing, creating a sufficient gap that many commits can be flushed in a single sync. From the above config, I believe I've gone some way towards achieving this, and the performance I'm now seeing suggests as much.
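
A rough way to quantify the commit_delay/commit_siblings effect, assuming contrib/pgbench is built (database name illustrative): initialise once, then run the same load with commit_delay = 0 and again at 10000, and compare the tps figures.

$ createdb pgbench_test
$ pgbench -i -s 10 pgbench_test     # initialise, scale factor 10
$ pgbench -c 8 -t 500 pgbench_test  # 8 concurrent clients, 500 transactions each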

But I think we can get more out of this, as the above settings were picked from thin air, and my concern is being able to determine actual WAL usage - and whether we're caught out at the other extreme, i.e. not committing fast enough. I've read that the number of WAL files shouldn't exceed 2 * checkpoint_segments + 1, which with checkpoint_segments = 16 would cap pg_xlog at 33 files - yet my pg_xlog directory contains 74. This suggests I'm using more logfiles than I should. I'm also not sure what wal_buffers really should be set to.
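
For what it's worth, this is how I'm counting them (the data directory path is site-specific):

$ ls /var/lib/pgsql/data/pg_xlog | wc -l
74

At 16MB a segment that's roughly 1.2GB of xlog on disk.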

Can I get any feedback on this? How to look into pg's WAL usage is what I'm really after. BTW, this is an old install I'm afraid - 7.2.2 - and it's been impossible to upgrade until now because the system has been too slow. I have moved pg_xlog onto the root SCSI disk - it doesn't appear to have made a huge difference, though that disk could be on the same cable as the array.
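
For anyone wanting to do the same, the usual approach is a symlink, with the postmaster stopped (paths here are illustrative):

$ mv /var/lib/pgsql/data/pg_xlog /root-disk/pg_xlog
$ ln -s /root-disk/pg_xlog /var/lib/pgsql/data/pg_xlog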

Some additional information as background:
I can supply sar output if required. I'm currently running our expiry daemon, which scans all mail for each domain (i.e. each table); this seems to take a few hours on a 26GB archive, though it's a lot faster than it ever was. Load gets to about 8 as backends are all busy doing selects, updates and deletes. The process has already been run recently, so it shouldn't be doing too much deleting this time. It still seems IO bound, and I don't think I'm going to solve that without a better disk arrangement, but exhausting the other possibilities is essentially what I'm doing now.
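
For a feel of the workload, each expiry pass amounts to something like the following per domain table (the table name and retention period here are invented):

DELETE FROM mail_example_com WHERE received < now() - interval '30 days';

followed by a vacuum to reclaim the dead tuples.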


$ sar -B -s 16:00:00

16:35:55     pgpgin/s pgpgout/s  activepg  inadtypg  inaclnpg  inatarpg
16:36:00      3601.60    754.40    143492     87791     10230     48302
16:36:05      5766.40    552.80    143947     88039     10170     48431
16:36:10      3663.20    715.20    144578     88354      9075     48401
16:36:15      3634.40    412.00    144335     88405      9427     48433
16:36:20      5578.40    447.20    143626     88545      9817     48397
16:36:25      4154.40    469.60    143640     88654     10388     48536
16:36:30      3504.00    635.20    143538     88763      9992     48458
16:36:35      3540.80    456.00    142515     88949     10444     48381
16:36:40      3334.40   1067.20    143268     89244      9832     48468

$ vmstat 5
  procs                      memory      swap          io     system        cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  7  1  29588  10592  15700 809060   1   0    97    75    0   103  13   9  79
 3  8  0  29588  11680  15736 807620   0   0  3313   438 1838  3559  19  13  68
 2 13  1  29588  12808  15404 800328   0   0  4470   445 1515  1752   7   7  86
 0  9  1  29588  10992  15728 806476   0   0  2933   781 1246  2686  14  10  76
 2  5  1  29588  11336  15956 807884   0   0  3354   662 1773  5211  27  17  57
 4  5  0  29696  13072  16020 813872   0  24  4282   306 2632  7862  45  25  31
 4  6  1  29696  10400  16116 815084   0   0  5086   314 2668  7893  47  26  27
 9  2  1  29696  13060  16308 814232  27   0  3927   748 2586  7836  48  29  23
 3  8  1  29696  10444  16232 812816   3   0  4015   433 2443  7180  47  28  25
 8  4  0  29696  10904  16432 812488   0   0  4537   500 2616  8418  46  30  24
 4  6  2  29696  11048  16320 810276   0   0  6076   569 1893  3919  20  14  66
 0  5  0  29696  10480  16600 813788   0   0  4595   435 2400  6215  33  21  46
 3  6  0  29696  10536  16376 812248   0   0  3802   504 2417  7921  43  25  32
 1  6  1  29696  11236  16500 809636   0   0  3691   357 2171  5199  24  15  61
 0 14  1  29696  10228  16036 801368   0   0  4038   561 1566  3288  16  12  72


Sorry it's so long, but I thought some background info would be better than none. Thanks for reading,

--

Rob Fielding
Development
Designer Servers Ltd

