Re: [PERFORM] WAL Optimisation - configuration and usage
> random_page_cost = 0.5

Not likely. The lowest this value should ever be is 1, and that's if
you're using something like a RAM drive. If your drives are doing a ton of
extra random I/O due to the above (rather than sequential reads), it would
lower the throughput quite a bit. Try a value of 2 for a while.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
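To see why random_page_cost below 1 is suspect, here is a deliberately simplified sketch of the planner's page-cost trade-off. The real cost model also includes CPU and per-tuple costs; the function names and the 10,000-page table are illustrative only, not PostgreSQL internals.

```python
# Simplified sketch: a sequential page fetch costs 1.0 unit, a random page
# fetch costs random_page_cost units. (The actual planner model is richer.)

def seq_scan_cost(pages: int) -> float:
    """Every page read in order, at unit cost."""
    return pages * 1.0

def index_scan_cost(pages_fetched: int, random_page_cost: float) -> float:
    """Heap pages reached via an index are costed as random fetches."""
    return pages_fetched * random_page_cost

# With random_page_cost = 0.5, fetching the SAME pages randomly looks
# cheaper than reading them sequentially - physically implausible on
# spinning disks - so the planner over-favours index scans.
table_pages = 10_000
print(seq_scan_cost(table_pages))         # 10000.0
print(index_scan_cost(table_pages, 0.5))  # 5000.0  (looks "cheaper")
print(index_scan_cost(table_pages, 2.0))  # 20000.0 (more realistic)
```

With the value set to 0.5, every plan comparison is biased toward random I/O, which can explain the extra random reads complained about above.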
Re: [PERFORM] WAL Optimisation - configuration and usage
Neil,

> Actually, the manual is correct: in 7.4 and earlier releases, enabling
> wal_debug can be done without also setting a compile-time #ifdef. As of
> current CVS HEAD, the WAL_DEBUG #ifdef must be defined before this
> variable is available.

Hmmm. I was told that it was this way for 7.4 as well; that's why it's in
the docs that way.

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] WAL Optimisation - configuration and usage
Josh Berkus wrote:
> Hmmm. I was told that it was this way for 7.4 as well; that's why it's
> in the docs that way.

No such statement is made in the docs AFAIK: they merely say "If nonzero,
turn on WAL-related debugging output." I invented a new #ifdef symbol when
making this change in CVS HEAD, so I think you are misremembering.

-Neil
Re: [PERFORM] WAL Optimisation - configuration and usage
Neil Conway wrote:
> Simon Riggs wrote:
>> Josh Berkus wrote:
>>> Simon Riggs wrote:
>>>> Please set WAL_DEBUG to 1 so we can see a bit more info: thanks.
>>>
>>> I'm pretty sure that WAL_DEBUG requires a compile-time option.
>>
>> I'm surprised, but you are right, the manual does SAY this requires a
>> compile-time option; it is unfortunately not correct.
>
> Actually, the manual is correct: in 7.4 and earlier releases, enabling
> wal_debug can be done without also setting a compile-time #ifdef. As of
> current CVS HEAD, the WAL_DEBUG #ifdef must be defined before this
> variable is available.

Touché! I stand corrected; thank you both. My suggestion does work for
Rob, then.

[This also implies I have a screwed version on my machine, so thank you
also for flushing that lurking issue out for me. I'd had a suspicion for a
few weeks. Lucky I'm still just prototyping.]

On the other hand, I was just about to change the wal_debug behaviour to
allow better debugging of PITR features as they're added. I think it is
very important to be able to put the system fairly easily into debug mode;
a recompile is easy enough, but it would be even better to avoid this
completely. This would mean reversing the change you describe. Here's the
design.

The behaviour I wish to add is: keep wal_debug as a value between 0 and
16. If it is 0, there is no debug output (the default). Apply the
following bitmasks against the value:

  Mask 1 = XLOG checkpoints get logged
  Mask 2 = Archive API calls get logged
  Mask 4 = Transaction commits get logged
  Mask 8 = Flush/INSERTs get logged

That way it should be fairly straightforward to control the amount and
type of information available to administrators. The existing design
produces too much info to be easily usable, mostly requiring a perl
program to filter out the info overload and do record counts.

This suggested design lets you control the volume of messages, since the
bitmasks are arranged in volume/frequency order, and brings the wal_debug
option back into something useful for problem diagnosis on live systems,
not just hacking the code.

Anybody object to these mods, or have better/different ideas? Getting the
diagnostics right is fairly important, IMHO, to making PITR become real.

Best regards, Simon Riggs
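The proposed bitmask scheme can be sketched as follows. The constant names are illustrative only, not actual PostgreSQL symbols; the mask values are the ones Simon lists above.

```python
# Sketch of the proposed wal_debug bitmask scheme (illustrative names).
WAL_DEBUG_CHECKPOINT = 1   # XLOG checkpoints get logged
WAL_DEBUG_ARCHIVE    = 2   # Archive API calls get logged
WAL_DEBUG_COMMIT     = 4   # transaction commits get logged
WAL_DEBUG_INSERT     = 8   # flush/insert records get logged

def should_log(wal_debug: int, mask: int) -> bool:
    """A message category is emitted only if its bit is set in wal_debug."""
    return (wal_debug & mask) != 0

# wal_debug = 0 disables everything; 3 logs checkpoints plus archive API
# calls; 15 logs all four categories (the noisiest setting).
assert not should_log(0, WAL_DEBUG_COMMIT)
assert should_log(3, WAL_DEBUG_ARCHIVE)
assert not should_log(3, WAL_DEBUG_INSERT)
assert all(should_log(15, m) for m in (1, 2, 4, 8))
```

Because the masks are ordered by message frequency, an administrator can dial in just the low-volume categories (checkpoints, archive calls) without drowning in per-record insert/flush output.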
Re: [PERFORM] WAL Optimisation - configuration and usage
Simon Riggs wrote:
> Josh Berkus wrote:
>> Simon Riggs wrote:
>>> Please set WAL_DEBUG to 1 so we can see a bit more info: thanks.
>>
>> I'm pretty sure that WAL_DEBUG requires a compile-time option.
>
> I'm surprised, but you are right, the manual does SAY this requires a
> compile-time option; it is unfortunately not correct.

Actually, the manual is correct: in 7.4 and earlier releases, enabling
wal_debug can be done without also setting a compile-time #ifdef. As of
current CVS HEAD, the WAL_DEBUG #ifdef must be defined before this
variable is available.

-Neil
Re: [PERFORM] WAL Optimisation - configuration and usage
Further update to my WAL experimentation. pg_xlog files have increased to
81, and checking today, up to 84. Currently nothing much is going on with
the server save a background process running a select every 30 seconds
with almost no impact (according to I/O from vmstat). This in itself is a
good sign - an improvement on last week - but I'd still like to get
clarification on WAL file usage if possible.

Log file tailing shows nothing more interesting than a whole set of
recycled transaction log file entries:

2004-03-01 16:01:55 DEBUG: recycled transaction log file 00710017
2004-03-01 16:07:01 DEBUG: recycled transaction log file 00710018
2004-03-01 16:17:14 DEBUG: recycled transaction log file 00710019
2004-03-01 16:22:20 DEBUG: recycled transaction log file 0071001A
2004-03-01 16:32:31 DEBUG: recycled transaction log file 0071001B
2004-03-01 16:37:36 DEBUG: recycled transaction log file 0071001C
2004-03-01 16:47:48 DEBUG: recycled transaction log file 0071001D
2004-03-01 16:52:54 DEBUG: recycled transaction log file 0071001E
2004-03-01 17:03:05 DEBUG: recycled transaction log file 0071001F

Looks kinda automated, but the times aren't quite even, at around 6-10
minutes apart.

cheers,
--
Rob Fielding
[EMAIL PROTECTED]
www.dsvr.co.uk
Development Designer Servers Ltd
Re: [PERFORM] WAL Optimisation - configuration and usage
Rob Fielding wrote:
> My focus today has been on WAL - I've not looked at WAL before. By
> increasing the settings thus:
>
> wal_buffers = 64          # need to determine WAL usage
> wal_files = 64            # range 0-64
> wal_sync_method = fsync   # the default varies across platforms
> wal_debug = 0             # range 0-16
> # hopefully this should see fewer LogFlushes per LogInsert - use more
> # WAL though.
> commit_delay = 1          # range 0-10
> commit_siblings = 2       # range 1-1000
> checkpoint_segments = 16  # in logfile segments (16MB each), min 1
> checkpoint_timeout = 300  # in seconds, range 30-3600
> fsync = true
>
> But I think we can get more out of this, as the above settings were
> picked from thin air, and my concern here is being able to determine WAL
> file usage, and whether the system is caught out on the other extreme -
> that we're not committing fast enough. Currently I've read that WAL
> files shouldn't number more than 2*checkpoint_segments+1, however my
> pg_xlog directory contains 74 files. This suggests I'm using more
> logfiles than I should. Also I'm not sure what wal_buffers really should
> be set to.

As Richard Huxton says, we're all learning... I'm looking at WAL logic for
other reasons right now. This is based upon my reading of the code; I
think the manual contains at least one confusion that has not assisted
your understanding (or mine):

The WAL files limit of 2*checkpoint_segments+1 refers to the number of
files allocated in advance of the current log, not the total number of
files in use. pg uses a cycle of logs, reusing older ones when all the
transactions in those log files have been checkpointed. The limit is set
to allow a checkpoint to release segments and have them all be reused at
once; pg stores them up for use again later, when the workload hots up
again. If it cannot recycle a file because there is a still-current
transaction at the end of the cycle, then it will allocate a new file and
use that instead, still keeping everything in a cycle.
Thus if transactions are particularly long-running, the number of files in
the cycle will grow. So overall, normal behaviour so far: I don't think
there's anything to worry about in having that many files in your xlog
cycle. That behaviour is usually seen with occasional long-running
transactions. When a long-running transaction is over, pg will try to
reduce the number of files in the cycle until it's back to target.

You seem to be reusing one file in the cycle every 10 minutes - this is
happening as the result of a checkpoint timeout - "kinda automated", as
you say. [A checkpoint is the only time you can get the messages you're
getting.] At one file per checkpoint, it will take 16*2+1 = 33 checkpoints
* 10 mins = about 5.5 hours before it hits the advance-allocation file
limit and then starts to reduce the number of files. That's why they
appear to stay constant...

If you want to check whether this is correct, manually issue a number of
CHECKPOINT statements. The messages should change from "recycled" to
"removing transaction log file" once you've got to 33 checkpoints, and the
number of WAL log files should start to go down also. If so, then there's
nothing too strange going on, just pg being a little slow in reducing the
number of WAL log files.

So, it seems that you are running occasional very long transactions.
During that period you run up to 60-80 WAL files. That's just on the edge
of your wal_files limit, which means you start to write WAL quicker than
you'd like past that point. Your checkpoint_timeout is 300 seconds, but a
checkpoint will also be called every checkpoint_segments, or currently
every 16 WAL files. Since you go as high as 60-80, you are checkpointing
4-5 times during the heavy transaction period - assuming it's all one
block of work. In the end, each checkpoint is causing a huge I/O storm,
during which not much work happens.
I would suggest that you reduce the effect of checkpointing by either:
- re-writing the app to do scan deletes in smaller chunks in quieter
  periods, or
- increasing checkpoint_segments to 128, though this may affect your
  recoverability.

You can of course only do so much with the memory available to you. If you
increase one allocation of memory, you may have to reduce another
parameter, and that may be counter-productive.

[An alternative view is that you should go for more frequent, not less
frequent, checkpoints in this situation, smoothing out the effect of the
checkpoints rather than trying to avoid them altogether. On the other
hand, that approach also increases total WAL log volume, which means
you'll make poor use of I/O and memory buffering. I'd stay high.]

However, I'm not sure:
- why a checkpoint interval of 300 secs causes them to happen every 10
  mins in quieter periods; is that an occasional update occurring?
- why a checkpoint only releases a single WAL file each time - but that
  may be me just reading the code incorrectly.

Please set WAL_DEBUG to 1 so we can see a bit more info: thanks.

> Can I get any feedback on this? How to look into pg's WAL usage?
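Simon's arithmetic can be checked with a quick sketch. The one-file-per-checkpoint recycling rate and the ~10-minute checkpoint interval are his observations from Rob's logs, not fixed rules; checkpoint_segments = 16 and the 16 MB segment size come from Rob's configuration above.

```python
# Back-of-envelope check of the WAL file-count arithmetic in the thread.
checkpoint_segments = 16
segment_mb = 16                       # each WAL segment is 16 MB
minutes_per_checkpoint = 10           # observed recycle rate from the logs

# Files kept pre-allocated ahead of the current log position:
advance_alloc_limit = 2 * checkpoint_segments + 1
print(advance_alloc_limit)                        # 33 files
print(advance_alloc_limit * segment_mb)           # ~528 MB of pg_xlog

# At one file recycled per checkpoint, time until the excess files from a
# long transaction start being removed rather than recycled:
hours = advance_alloc_limit * minutes_per_checkpoint / 60
print(round(hours, 1))                            # ~5.5 hours
```

This matches the observation that the file count in pg_xlog appears to stay constant between long-transaction bursts: the cycle shrinks back toward 33 files only slowly, one checkpoint at a time.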
Re: [PERFORM] WAL Optimisation - configuration and usage
Simon,

> Please set WAL_DEBUG to 1 so we can see a bit more info: thanks.

I'm pretty sure that WAL_DEBUG requires a compile-time option.

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] WAL Optimisation - configuration and usage
Rob,

Sir - I have to congratulate you on having the most coherently summarised
and yet complex list query I have ever seen. I fear that I will be
learning from this problem rather than helping, but one thing did puzzle
me - you've set your random_page_cost to 0.5? I'm not sure this is
sensible - you may be compensating for some other parameter that is out
of range.

--
Richard Huxton