Re: [PERFORM] WAL Optimisation - configuration and usage

2004-06-09 Thread Rod Taylor
> random_page_cost = 0.5

Not likely. The lowest this value should ever be is 1, and that's if
you're using something like a RAM drive.

If your drives are doing a ton of extra random IO due to the above
(rather than sequential reads), it would lower the throughput quite a
bit.

Try a value of 2 for a while.
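
If you want to test the effect before touching postgresql.conf, the
planner cost parameters can be changed per-session. A minimal sketch -
the table and query here are hypothetical, purely for illustration:

  SET random_page_cost = 2;   -- session-local; no restart needed
  EXPLAIN ANALYZE
  SELECT * FROM orders WHERE customer_id = 42;
  -- compare the chosen plan and runtime against the same query under
  -- the old value, then persist the winner in postgresql.conf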





Re: [PERFORM] WAL Optimisation - configuration and usage

2004-03-03 Thread Josh Berkus
Neil,

> Actually, the manual is correct: in 7.4 and earlier releases, enabling
> wal_debug can be done without also setting a compile-time #ifdef. As
> of current CVS HEAD, the WAL_DEBUG #ifdef must be defined before this
> variable is available.

Hmmm.   I was told that it was this way for 7.4 as well; that's why it's in 
the docs that way.


-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] WAL Optimisation - configuration and usage

2004-03-03 Thread Neil Conway
Josh Berkus wrote:
> Hmmm. I was told that it was this way for 7.4 as well; that's why it's
> in the docs that way.

No such statement is made in the docs AFAIK: they merely say "If
nonzero, turn on WAL-related debugging output."

I invented a new #ifdef symbol when making this change in CVS HEAD, so 
I think you are misremembering.

-Neil



Re: [PERFORM] WAL Optimisation - configuration and usage

2004-03-03 Thread Simon Riggs
Neil Conway wrote:
> Simon Riggs wrote:
> > Josh Berkus wrote:
> > > Simon Riggs wrote:
> > > > Please set WAL_DEBUG to 1 so we can see a bit more info: thanks.
> > >
> > > I'm pretty sure that WAL_DEBUG requires a compile-time option.
> >
> > I'm surprised, but you are right, the manual does SAY this requires a
> > compile-time option; it is unfortunately not correct.
>
> Actually, the manual is correct: in 7.4 and earlier releases, enabling
> wal_debug can be done without also setting a compile-time #ifdef. As
> of current CVS HEAD, the WAL_DEBUG #ifdef must be defined before this
> variable is available.

Touché! I stand corrected, thank you both. My suggestion does work for
Rob, then.

[This also implies I have a screwed version on my machine, so thank you
also for flushing that lurking issue out for me. I'd had a suspicion for
a few weeks. Lucky I'm still just prototyping.]

On the other hand, I was just about to change the wal_debug behaviour to
allow better debugging of PITR features as they're added. I think it is
very important to be able to put the system fairly easily into debug
mode; a recompile is easy enough, but it would be even better to avoid
this completely. This would mean reversing the change you describe:
here's the design:

The behaviour I wish to add is:
Keep wal_debug as a value between 0 and 16.
If = 0 then no debug output (default).
Use the following bitmasks against the value:
Mask 1 = XLOG checkpoints get logged
Mask 2 = Archive API calls get logged
Mask 4 = Transaction commits get logged
Mask 8 = Flush & INSERTs get logged

That way it should be fairly straightforward to control the amount and
type of information available to administrators. The existing design
produces too much info to be easily usable, mostly requiring a perl
program to filter out the info overload and do record counts. The
suggested design lets you control the volume of messages, since the
bitmasks are arranged in volume/frequency order, and it brings the
wal_debug option back to something useful for problem diagnosis on live
systems, not just hacking the code.
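
To make the intent concrete, here is how an administrator would combine
the masks under this scheme (a sketch of the proposed interface, not
current behaviour):

  SET wal_debug = 1;    -- mask 1 only: XLOG checkpoints
  SET wal_debug = 3;    -- masks 1+2: checkpoints plus archive API calls
  SET wal_debug = 15;   -- all four masks: full debug output
  SET wal_debug = 0;    -- back to silence (the default)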

Anybody object to these mods, or have better/different ideas? Getting
the diagnostics right is fairly important, IMHO, to making PITR become
real.

Best regards, Simon Riggs




Re: [PERFORM] WAL Optimisation - configuration and usage

2004-03-02 Thread Neil Conway
Simon Riggs wrote:
> Josh Berkus wrote:
> > Simon Riggs wrote:
> > > Please set WAL_DEBUG to 1 so we can see a bit more info: thanks.
> >
> > I'm pretty sure that WAL_DEBUG requires a compile-time option.
>
> I'm surprised, but you are right, the manual does SAY this requires a
> compile-time option; it is unfortunately not correct.

Actually, the manual is correct: in 7.4 and earlier releases, enabling
wal_debug can be done without also setting a compile-time #ifdef. As
of current CVS HEAD, the WAL_DEBUG #ifdef must be defined before this
variable is available.

-Neil



Re: [PERFORM] WAL Optimisation - configuration and usage

2004-03-01 Thread Rob Fielding
Further update to my WAL experimentation. pg_xlog files have increased
to 81, and checking today, up to 84. Currently there's nothing much
going on with the server save a background process running a select
every 30 seconds with almost no impact (according to IO from vmstat).

This in itself is a good sign - an improvement on last week's running -
but I'd still like to get clarification on WAL file usage if possible.

Tailing the log file shows nothing more interesting than a whole set of
"recycled transaction log file" entries:

2004-03-01 16:01:55 DEBUG:  recycled transaction log file 00710017
2004-03-01 16:07:01 DEBUG:  recycled transaction log file 00710018
2004-03-01 16:17:14 DEBUG:  recycled transaction log file 00710019
2004-03-01 16:22:20 DEBUG:  recycled transaction log file 0071001A
2004-03-01 16:32:31 DEBUG:  recycled transaction log file 0071001B
2004-03-01 16:37:36 DEBUG:  recycled transaction log file 0071001C
2004-03-01 16:47:48 DEBUG:  recycled transaction log file 0071001D
2004-03-01 16:52:54 DEBUG:  recycled transaction log file 0071001E
2004-03-01 17:03:05 DEBUG:  recycled transaction log file 0071001F

Looks kinda automated, but the times aren't quite even, at around 6-10
minutes apart.

cheers,
--
Rob Fielding
[EMAIL PROTECTED]
www.dsvr.co.uk  Development Designer Servers Ltd



Re: [PERFORM] WAL Optimisation - configuration and usage

2004-03-01 Thread Simon Riggs
Rob Fielding wrote:
> My focus today has been on WAL - I've not looked at WAL before. By
> increasing the settings thus:
>
> wal_buffers = 64          # need to determine WAL usage
> wal_files = 64            # range 0-64
> wal_sync_method = fsync   # the default varies across platforms:
> wal_debug = 0             # range 0-16
>
> # hopefully this should see fewer LogFlushes per LogInsert - use more
> # WAL though.
> commit_delay = 1          # range 0-100000
> commit_siblings = 2       # range 1-1000
> checkpoint_segments = 16  # in logfile segments (16MB each), min 1
> checkpoint_timeout = 300  # in seconds, range 30-3600
> fsync = true
>
> But I think we can get more out of this, as the above settings were
> picked from thin air, and my concern here is being able to determine
> WAL file usage and whether the system is caught out on the other
> extreme, that we're not committing fast enough. Currently I've read
> that WAL files shouldn't be more than 2*checkpoint_segments+1, however
> my pg_xlog directory contains 74 files. This suggests I'm using more
> logfiles than I should. Also I'm not sure what wal_buffers really
> should be set to.

As Richard Huxton says, we're all learning...I'm looking at WAL logic
for other reasons right now...

This is based upon my reading of the code; I think the manual contains
at least one confusion that has not assisted your understanding (or
mine):
 
The WAL files limit of 2*checkpoint_segments+1 refers to the number of
files allocated in advance of the current log, not the total number of
files in use. pg uses a cycle of logs, reusing older ones once all the
transactions in those log files have been checkpointed. The limit is set
to allow a checkpoint to release segments and have them all be reused at
once; pg stores them up for use again later when the workload hots up
again.

If it cannot recycle a file because there is a still-current txn on the
end of the cycle, then it will allocate a new file and use this instead,
but still keeping everything in a cycle. Thus if transactions are
particularly long running, then the number of files in the cycle will
grow. So overall, normal behaviour so far. I don't think there's
anything to worry about in having that many files in your xlog cycle.
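
As a back-of-envelope check of that formula against Rob's settings
(checkpoint_segments = 16, 16MB per segment) - purely illustrative
arithmetic:

  SELECT 2 * 16 + 1        AS advance_segments,  -- 2*checkpoint_segments+1 = 33
         (2 * 16 + 1) * 16 AS advance_mb;        -- about 528MB of pg_xlog
  -- 74-84 files in pg_xlog is therefore well beyond the advance
  -- allocation, consistent with long-running transactions extending
  -- the cycle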

That behaviour is usually seen with occasional long-running txns. When a
long-running transaction is over, pg will try to reduce the number of
files in the cycle until it's back to the target.

You seem to be reusing one file in the cycle every 10 mins - this is
happening as the result of a checkpoint timeout - "kinda automated", as
you say. [A checkpoint is the only time you can get the messages you're
getting.] At one file per checkpoint, it will take 16*2+1 = 33
checkpoints * 10 mins = about 5.5 hours before it hits the
advance-allocation file limit and then starts to reduce the number of
files. That's why they appear to stay constant...

If you want to check whether this is correct, manually issue a number of
CHECKPOINT statements. The messages should change from "recycled
transaction log file" to "removing transaction log file" once you've got
to 33 checkpoints, and the number of WAL log files should start to go
down too. If so, then there's nothing too strange going on, just pg
being a little slow in reducing the number of WAL log files.
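
A concrete way to run that test from psql (the \! shell escape and the
pg_xlog path below are assumptions - adjust for your data directory):

  CHECKPOINT;   -- superuser only; repeat until ~33 have been issued
  -- after each one, watch the server log for the message changing from
  -- "recycled transaction log file" to "removing transaction log file"
  \! ls /var/lib/pgsql/data/pg_xlog | wc -l   -- the count should fall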

So, it seems that you are running occasional very long transactions.
During that period you run up to 60-80 WAL files. That's just on the
edge of your wal_files limit, which means you start to write WAL quicker
than you'd like past that point. Your checkpoint_timeout is 300 seconds,
but a checkpoint will also be triggered every checkpoint_segments, or
currently every 16 WAL files. Since you go as high as 60-80, you are
checkpointing 4-5 times during the heavy transaction period - assuming
it's all one block of work. In the end, each checkpoint is causing a
huge I/O storm, during which not much work happens.

I would suggest that you reduce the effect of checkpointing by either:
- re-writing the app to do scan deletes in smaller chunks in quieter
periods, or
- increasing checkpoint_segments to 128, though this may affect your
recoverability (see the sketch below)
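
If you take the second option, the relevant postgresql.conf lines would
look something like this (a sketch; 128 is the figure suggested above,
and the comment just applies the 2*checkpoint_segments+1 rule):

checkpoint_segments = 128  # 2*128+1 = 257 segments * 16MB = ~4GB pg_xlog worst case
checkpoint_timeout = 300   # unchanged; the trade-off is a longer recovery replay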

You can of course only do so much with the memory available to you. If
you increase one allocation of memory, you may have to reduce another
parameter, and that may be counterproductive.

[An alternative view is that you should go for more frequent, not less
frequent checkpoints in this situation, smoothing out the effect of the
checkpoints, rather than trying to avoid them at all. On the other hand,
that approach also increases total WAL log volume, which means you'll
make poor use of I/O and memory buffering. I'd stay high.]

However, I'm not sure
- why a checkpoint interval of 300 secs causes them to happen every 10
mins in quieter periods; is that an occasional update occurring?
- why a checkpoint only releases a single WAL file each time - but that
may be me just reading the code incorrectly.

Please set WAL_DEBUG to 1 so we can see a bit more info: thanks.

> Can I get any feedback on this? How to look into pg's WAL usage 

Re: [PERFORM] WAL Optimisation - configuration and usage

2004-03-01 Thread Josh Berkus
Simon,

> Please set WAL_DEBUG to 1 so we can see a bit more info: thanks.

I'm pretty sure that WAL_DEBUG requires a compile-time option.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] WAL Optimisation - configuration and usage

2004-02-28 Thread Richard Huxton
Rob

Sir - I have to congratulate you on having the most coherently summarised and 
yet complex list query I have ever seen. 

I fear that I will be learning from this problem rather than helping,
but one thing did puzzle me - you've set your random_page_cost to 0.5?
I'm not sure this is sensible - you may be compensating for some other
out-of-range parameter.


-- 
  Richard Huxton
