Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Simon Riggs
On Tue, 2007-03-27 at 20:48 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: It seems possible to reduce overall WAL volume by roughly 25% on common workloads by optimising the way UPDATE statements generate WAL. This seems a huge amount of work to optimize *one* benchmark.

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Kenneth Marshall
On Wed, Mar 28, 2007 at 08:07:14AM +0100, Simon Riggs wrote: On Tue, 2007-03-27 at 20:48 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: It seems possible to reduce overall WAL volume by roughly 25% on common workloads by optimising the way UPDATE statements generate WAL.

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Tom Lane
Kenneth Marshall [EMAIL PROTECTED] writes: We use DSPAM as one of our anti-spam options. Its UPDATE pattern is to increment a spam counter or a not-spam counter while keeping the user and token information the same. This would benefit from this optimization. Would it? How wide is the user and

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Florian G. Pflug
Kenneth Marshall wrote: We use DSPAM as one of our anti-spam options. Its UPDATE pattern is to increment a spam counter or a not-spam counter while keeping the user and token information the same. This would benefit from this optimization. Currently we are forced to use MySQL with MyISM tables

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Tom Lane
Kenneth Marshall [EMAIL PROTECTED] writes: On Wed, Mar 28, 2007 at 09:46:30AM -0400, Tom Lane wrote: Would it? How wide is the user and token information? Sorry about the waste of time. I just noticed that the proposal is only for rows over 128 bytes. The token definition is: CREATE TABLE

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 15:51 +0200, Florian G. Pflug wrote: BTW, the COMMIT NOWAIT feature Simon Riggs proposed should provide a huge speedup too, since dspam runs one transaction for each token it has to update. I've switched to doing the COMMIT NOWAIT as a priority now, but do plan to do

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes: Yeah, this illustrates my concern that the proposal is too narrowly focused on a specific benchmark. A lot of the recently proposed changes don't really fit in the optimizations category very well at all. I think of them more as avoiding pitfalls. Currently

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Joshua D. Drake
Gregory Stark wrote: Tom Lane [EMAIL PROTECTED] writes: Yeah, this illustrates my concern that the proposal is too narrowly focused on a specific benchmark. A lot of the recently proposed changes don't really fit in the optimizations category very well at all. I think of them more as

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes: A lot of the recently proposed changes don't really fit in the optimizations category very well at all. I think of them more as avoiding pitfalls. Well, we can't put a major amount of complexity into the system for each possible pitfall. This one is

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread August Zajonc
Simon Riggs wrote: On Tue, 2007-03-27 at 20:48 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: It seems possible to reduce overall WAL volume by roughly 25% on common workloads by optimising the way UPDATE statements generate WAL. This seems a huge amount of work to optimize

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 10:51 -0400, Tom Lane wrote: Kenneth Marshall [EMAIL PROTECTED] writes: On Wed, Mar 28, 2007 at 09:46:30AM -0400, Tom Lane wrote: Would it? How wide is the user and token information? Sorry about the waste of time. I just noticed that the proposal is only for rows

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes: Actually, PG does extremely well on that in the situation where the static data is *really* wide, ie, wide enough to be toasted out-of-line. Simon's proposal will only help for an intermediate range of situations where the row is wide but not very wide.

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes: The reason I think this is idea is exciting is that later I would suggest applying it to HOT updates. Having to keep a spare tuple's worth of space in every page is pretty annoying. But if we could get by with the average half-tuple dead space to do an

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Zeugswetter Andreas ADI SD
I agree that these values need a second look. I think a TOAST_TUPLE_THRESHOLD well smaller than the current value would still easily pay its way. With a little caution to avoid wasting too much effort on the last few bytes I suspect even as low as 400-500 bytes is probably

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Tom Lane
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: Btw: Do we consider the existance of toasted columns in the seq-scan cost estimation ? Not at present. There was some discussion of this but it seems like a fair amount of work --- we don't currently track statistics on how many of a

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 11:17 -0400, Tom Lane wrote: This one is similar, if you keep a bunch of static data attached to some small dynamic data your WAL and table bloats. Actually, PG does extremely well on that in the situation where the static data is *really* wide, ie, wide enough to be

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-27 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: It seems possible to reduce overall WAL volume by roughly 25% on common workloads by optimising the way UPDATE statements generate WAL. This seems a huge amount of work to optimize *one* benchmark. If it weren't so narrowly focused on the properties of a