Re: [PERFORM] The shared buffers challenge

2011-05-31 Thread Merlin Moncure
On Fri, May 27, 2011 at 7:19 PM, Jeff Davis pg...@j-davis.com wrote:
 On Thu, 2011-05-26 at 09:31 -0500, Merlin Moncure wrote:
 Where they are most helpful is for masking of i/o if
 a page gets dirtied 1 times before it's written out to the heap

 Another possible benefit of higher shared_buffers is that it may reduce
 WAL flushes. A page cannot be evicted from shared_buffers until the WAL
 has been flushed up to the page's LSN (WAL before data); so if there
 is memory pressure to evict dirty buffers, it may cause extra WAL
 flushes.

 I'm not sure what the practical effects of this are, however, but it
 might be an interesting thing to test.

Hm, I bet it could make a fairly big difference if wal data is not on
a separate volume.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-28 Thread Greg Smith

On 05/27/2011 07:30 PM, Mark Kirkwood wrote:
Greg, having an example with some discussion like this in the docs 
would probably be helpful.


If we put that example into the docs, two years from now there will be 
people showing up here saying I used the recommended configuration from 
the docs that cut and paste it into their postgresql.conf, turning   
autovacuum off and everything.  Periodically people used to publish 
recommended postgresql.conf settings on random web pages, sometimes 
with poor suggestions, and those things kept showing up in people's 
configurations posted to the lists here for long after they were no 
longer applicable.  I've resisted publishing specific configuration 
examples in favor of working on pgtune specifically because of having 
observed that.


There's several new small features in 9.1 that make it a easier to 
instrument checkpoint behavior and how it overlaps with shared_buffers 
increases:  summary of sync times, ability to reset pg_stat_bgwriter, 
and a timestamp on when it was last reset.  It's not obvious at all how 
those all stitch together into some new tuning approaches, but they do.  
Best example I've given so far is at 
http://archives.postgresql.org/pgsql-hackers/2011-02/msg00209.php ; look 
at how I can turn the mysterious buffers_backend field into something 
measured in MB/s using these new features.


That's the direction all this is moving toward.  If it's easy for people 
to turn the various buffer statistics into human-readable form, the way 
tuning changes impact the server operation becomes much easier to see.  
Documenting the much harder to execute methodology you can apply to 
earlier versions isn't real exciting to me at this point.  I've done 
that enough that people who want the info can find it, even if it's not 
all in the manual.  The ways you can do it in 9.1 are so much easier, 
and more accurate in regards to the sync issues, that I'm more 
interested in beefing up the manual in regards to using them at this point.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-27 Thread Merlin Moncure
On Thu, May 26, 2011 at 6:10 PM, Greg Smith g...@2ndquadrant.com wrote:
 Merlin Moncure wrote:

 So, the challenge is this: I'd like to see repeatable test cases that
 demonstrate regular performance gains  20%.  Double bonus points for
 cases that show gains  50%.

 Do I run around challenging your suggestions and giving you homework?  You
 have no idea how much eye rolling this whole message provoked from me.

That's just plain unfair: I didn't challenge your suggestion nor give
you homework. In particular, I'm not suggesting the 25%-ish default is
wrong -- but trying to help people understand why it's there and what
it's doing.  I bet 19 people out of 20 could not explain what the
primary effects of shared_buffers with any degree of accuracy.  That
group of people in fact would have included me until recently, when I
started studying bufmgr.c for mostly unrelated reasons.  Understand my
basic points:

*) the documentation should really explain this better (in particular,
it should debunk the myth 'more buffers = more caching')
*) the 'what' is not nearly so important as the 'why'
*) the popular understanding of what buffers do is totally, completely, wrong
*) I'd like to gather cases to benchmark changes that interact with
these settings
*) I think you are fighting the 'good fight'. I'm trying to help

 OK, so the key thing to do is create a table such that shared_buffers is
 smaller than the primary key index on a table, then UPDATE that table
 furiously.  This will page constantly out of the buffer cache to the OS one,
 doing work that could be avoided.  Increase shared_buffers to where it fits
 instead, and all the index writes are buffered to write only once per
 checkpoint.  Server settings to exaggerate the effect:

This is exactly what I'm looking for...that's really quite striking.
I knew that buffer 'hit' before it goes out the door is what to gun
for.

 As for figuring out how this impacts more complicated cases, I hear somebody
 wrote a book or something that went into pages and pages of detail about all
 this.  You might want to check it out.

so i've heard: http://imgur.com/lGOqx (and yes: I 100% endorse the
book: everyone who is serious about postgres should own a copy).
Anyways, double points to you ;-).

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-27 Thread Scott Carey
So how far do you go?  128MB?  32MB?  4MB?

Anecdotal and an assumption, but I'm pretty confident that on any server
with at least 1GB of dedicated RAM, setting it any lower than 200MB is not
even going to help latency (assuming checkpoint and log configuration is
in the realm of sane, and connections*work_mem is sane).

The defaults have been so small for so long on most platforms, that any
increase over the default generally helps performance -- and in many cases
dramatically.  So if more is better, then most users assume that even more
should be better.
But its not so simple, there are drawbacks to a larger buffer and
diminishing returns with larger size.  I think listing the drawbacks of a
larger buffer and symptoms that can result would be a big win.

And there is an OS component to it too.  You can actually get away with
shared_buffers at 90% of RAM on Solaris.  Linux will explode if you try
that (unless recent kernels have fixed its shared memory accounting).


On 5/26/11 8:10 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:

Merlin Moncure mmonc...@gmail.com wrote:
 
 So, the challenge is this: I'd like to see repeatable test cases
 that demonstrate regular performance gains  20%.  Double bonus
 points for cases that show gains  50%.
 
Are you talking throughput, maximum latency, or some other metric?
 
In our shop the metric we tuned for in reducing shared_buffers was
getting the number of fast queries (which normally run in under a
millisecond) which would occasionally, in clusters, take over 20
seconds (and thus be canceled by our web app and present as errors
to the public) down to zero.  While I know there are those who care
primarily about throughput numbers, that's worthless to me without
maximum latency information under prolonged load.  I'm not talking
90th percentile latency numbers, either -- if 10% of our web
requests were timing out the villagers would be coming after us with
pitchforks and torches.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-27 Thread Kevin Grittner
Scott Carey sc...@richrelevance.com wrote:
 
 So how far do you go?  128MB?  32MB?  4MB?
 
Under 8.2 we had to keep shared_buffers less than the RAM on our BBU
RAID controller, which had 256MB -- so it worked best with
shared_buffers in the 160MB to 200MB range.  With 8.3 we found that
anywhere from 512MB to 1GB performed better without creating
clusters of stalls.  In both cases we also had to significantly
boost the aggressiveness of the background writer.
 
Since the sweet spot is so dependent on such things as your RAID
controller and your workload, I *highly* recommend Greg's
incremental tuning approach.  The rough guidelines which get tossed
about make reasonable starting points, but you really need to make
relatively small changes with the actual load you're trying to
optimize and monitor the metrics which matter to you.  On a big data
warehouse you might not care if the database becomes unresponsive
for a couple minutes every now and then if it means better overall
throughput.  On a web server, you may not have much problem keeping
up with the overall load, but want to ensure reasonable response
time.
 
 Anecdotal and an assumption, but I'm pretty confident that on any
 server with at least 1GB of dedicated RAM, setting it any lower
 than 200MB is not even going to help latency (assuming checkpoint
 and log configuration is in the realm of sane, and
 connections*work_mem is sane).
 
I would add the assumption that you've got at least 256MB BBU cache
on your RAID controller.
 
 The defaults have been so small for so long on most platforms,
 that any increase over the default generally helps performance --
 and in many cases dramatically.
 
Agreed.
 
 So if more is better, then most users assume that even more should
 be better.
 
That does seem like a real risk.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-27 Thread Greg Smith

Scott Carey wrote:

And there is an OS component to it too.  You can actually get away with
shared_buffers at 90% of RAM on Solaris.  Linux will explode if you try
that (unless recent kernels have fixed its shared memory accounting).
  


You can use much larger values for shared_buffers on Solaris with UFS as 
the filesystem than almost anywhere else, as you say.  UFS defaults to 
caching an extremely tiny amount of memory by default.  Getting 
PostgreSQL to buffer everything therefore leads to minimal 
double-caching and little write caching that creates checkpoint spikes, 
so 90% is not impossible there.


If you're using ZFS instead, that defaults to similar aggressive caching 
as Linux.  You may even have to turn that down if you want the database 
to have a large amount of memory for its own use even with normal levels 
of sizing; just space for shared_buffers and work_mem can end up being 
too large of a pair of competitors for caching RAM.  ZFS is not really 
not tuned all that differently from how Linux approaches caching in that 
regard.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-27 Thread Greg Smith

Merlin Moncure wrote:

That's just plain unfair: I didn't challenge your suggestion nor give
you homework.


I was stuck either responding to your challenge, or leaving the 
impression I hadn't done the research to back the suggestions I make if 
I didn't.  That made it a mandatory homework assignment for me, and I 
didn't appreciate that.




*) the documentation should really explain this better (in particular,
it should debunk the myth 'more buffers = more caching'


Any attempt to make a serious change to the documentation around 
performance turns into a bikeshedding epic, where the burden of proof to 
make a change is too large to be worth the trouble to me anymore.  I 
first started publishing tuning papers outside of the main docs because 
it was the path of least resistance to actually getting something useful 
in front of people.  After failing to get even basic good 
recommendations for checkpoint_segments into the docs, I completely gave 
up on focusing there as my primary way to spread this sort of information.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-27 Thread Merlin Moncure
On Fri, May 27, 2011 at 1:47 PM, Greg Smith g...@2ndquadrant.com wrote:
 Merlin Moncure wrote:

 That's just plain unfair: I didn't challenge your suggestion nor give
 you homework.

 I was stuck either responding to your challenge, or leaving the impression I
 hadn't done the research to back the suggestions I make if I didn't.  That
 made it a mandatory homework assignment for me, and I didn't appreciate
 that.

really -- that wasn't my intent.  in any event, i apologize.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-27 Thread Robert Haas
On Fri, May 27, 2011 at 2:47 PM, Greg Smith g...@2ndquadrant.com wrote:
 Any attempt to make a serious change to the documentation around performance
 turns into a bikeshedding epic, where the burden of proof to make a change
 is too large to be worth the trouble to me anymore.  I first started
 publishing tuning papers outside of the main docs because it was the path of
 least resistance to actually getting something useful in front of people.
  After failing to get even basic good recommendations for
 checkpoint_segments into the docs, I completely gave up on focusing there as
 my primary way to spread this sort of information.

Hmm.  That's rather unfortunate.  +1 for revisiting that topic, if you
have the energy for it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-27 Thread Maciek Sakrejda
  After failing to get even basic good recommendations for
 checkpoint_segments into the docs, I completely gave up on focusing there as
 my primary way to spread this sort of information.

 Hmm.  That's rather unfortunate.  +1 for revisiting that topic, if you
 have the energy for it.

Another +1. While I understand that this is not simple, many users
will not look outside of standard docs, especially when first
evaluating PostgreSQL. Merlin is right that the current wording does
not really mention a down side to cranking shared_buffers on a system
with plenty of RAM.

---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-27 Thread Claudio Freire
On Fri, May 27, 2011 at 9:24 PM, Maciek Sakrejda msakre...@truviso.com wrote:
 Another +1. While I understand that this is not simple, many users
 will not look outside of standard docs, especially when first
 evaluating PostgreSQL. Merlin is right that the current wording does
 not really mention a down side to cranking shared_buffers on a system
 with plenty of RAM.

If you read the whole docs it does.

If you read caching, checkpoints, WAL, all of it, you can connect the dots.
It isn't easy, but database management isn't easy.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] The shared buffers challenge

2011-05-26 Thread Merlin Moncure
Hello performers, I've long been unhappy with the standard advice
given for setting shared buffers.  This includes the stupendously
vague comments in the standard documentation, which suggest certain
settings in order to get 'good performance'.  Performance of what?
Connection negotiation speed?  Not that it's wrong necessarily, but
ISTM too much based on speculative or anecdotal information.  I'd like
to see the lore around this setting clarified, especially so we can
refine advice to: 'if you are seeing symptoms x,y,z set shared_buffers
from a to b to get symptom reduction of k'.  I've never seen a
database blow up from setting them too low, but over the years I've
helped several people with bad i/o situations or outright OOM
conditions from setting them too high.

My general understanding of shared_buffers is that they are a little
bit faster than filesystem buffering (everything these days is
ultimately based on mmap AIUI, so there's no reason to suspect
anything else).  Where they are most helpful is for masking of i/o if
a page gets dirtied 1 times before it's written out to the heap, but
seeing any benefit from that at all is going to be very workload
dependent.  There are also downsides using them instead of on the heap
as well, and the amount of buffers you have influences checkpoint
behavior.  So things are complex.

So, the challenge is this: I'd like to see repeatable test cases that
demonstrate regular performance gains  20%.  Double bonus points for
cases that show gains  50%.  No points given for anecdotal or
unverifiable data. Not only will this help raise the body of knowledge
regarding the setting, but it will help produce benchmarking metrics
against which we can measure multiple interesting buffer related
patches in the pipeline.  Anybody up for it?

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Kevin Grittner
Merlin Moncure mmonc...@gmail.com wrote:
 
 So, the challenge is this: I'd like to see repeatable test cases
 that demonstrate regular performance gains  20%.  Double bonus
 points for cases that show gains  50%.
 
Are you talking throughput, maximum latency, or some other metric?
 
In our shop the metric we tuned for in reducing shared_buffers was
getting the number of fast queries (which normally run in under a
millisecond) which would occasionally, in clusters, take over 20
seconds (and thus be canceled by our web app and present as errors
to the public) down to zero.  While I know there are those who care
primarily about throughput numbers, that's worthless to me without
maximum latency information under prolonged load.  I'm not talking
90th percentile latency numbers, either -- if 10% of our web
requests were timing out the villagers would be coming after us with
pitchforks and torches.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Merlin Moncure
On Thu, May 26, 2011 at 10:10 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Merlin Moncure mmonc...@gmail.com wrote:

 So, the challenge is this: I'd like to see repeatable test cases
 that demonstrate regular performance gains  20%.  Double bonus
 points for cases that show gains  50%.

 Are you talking throughput, maximum latency, or some other metric?

I am talking about *any* metric..you've got something, let's see it.
But it's got to be verifiable, so no points scored.

See my note above about symptoms -- if your symptom of note happens to
be unpredictable spikes in fast query times under load, then I'd like
to scribble that advice directly into the docs along with (hopefully)
some reasoning of exactly why more database managed buffers are
helping.   As noted, I'm particularly interested in things we can test
outside of production environments, since I'm pretty skeptical the
Wisconsin Court System is going to allow the internet to log in and
repeat and verify test methodologies.  Point being: cranking buffers
may have been the bee's knees with, say, the 8.2 buffer manager, but
present and future improvements may have render that change moot or
even counter productive.  I doubt it's really changed much, but we
really need to do better on this -- all else being equal, the lowest
shared_buffers setting possible without sacrificing performance is
best because it releases more memory to the o/s to be used for other
things -- so everthing's bigger in Texas type approaches to
postgresql.conf manipulation (not that I see that here of course) are
not necessarily better :-).

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Claudio Freire
On Thu, May 26, 2011 at 5:36 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Point being: cranking buffers
 may have been the bee's knees with, say, the 8.2 buffer manager, but
 present and future improvements may have render that change moot or
 even counter productive.

I suggest you read the docs on how shared buffers work, because,
reasonably, it would be all the way around.

Recent improvments into how postgres manage its shared buffer pool
makes them better than the OS cache, so there should be more incentive
to increase them, rather than decrease them.

Workload conditions may make those improvements worthless, hinting
that you should decrease them.

But you have to know your workload and you have to know how the shared
buffers work.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Merlin Moncure
On Thu, May 26, 2011 at 10:45 AM, Claudio Freire klaussfre...@gmail.com wrote:
 On Thu, May 26, 2011 at 5:36 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Point being: cranking buffers
 may have been the bee's knees with, say, the 8.2 buffer manager, but
 present and future improvements may have render that change moot or
 even counter productive.

 I suggest you read the docs on how shared buffers work, because,
 reasonably, it would be all the way around.

 Recent improvments into how postgres manage its shared buffer pool
 makes them better than the OS cache, so there should be more incentive
 to increase them, rather than decrease them.

 Workload conditions may make those improvements worthless, hinting
 that you should decrease them.

 But you have to know your workload and you have to know how the shared
 buffers work.

I am not denying that any of those things are the case, although your
assumption that I haven't read the documentation was obviously not
grounded upon research.  What you and I know/don't know is not the
point.  The point is what we can prove, because going through the
motions of doing that is useful.  You are also totally missing my
other thrust, which is that future changes to how things work could
change the dynamics of .conf configuration -- btw not for the first
time in the history of the project.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Claudio Freire
On Thu, May 26, 2011 at 6:02 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The point is what we can prove, because going through the
 motions of doing that is useful.

Exactly, and whatever you can prove will be workload-dependant.
So you can't prove anything generally, since no single setting is
best for all.

 You are also totally missing my
 other thrust, which is that future changes to how things work could
 change the dynamics of .conf configuration

Nope, I'm not missing it, simply not commenting on it.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Merlin Moncure
On Thu, May 26, 2011 at 11:37 AM, Claudio Freire klaussfre...@gmail.com wrote:
 On Thu, May 26, 2011 at 6:02 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The point is what we can prove, because going through the
 motions of doing that is useful.

 Exactly, and whatever you can prove will be workload-dependant.
 So you can't prove anything generally, since no single setting is
 best for all.

Then we should stop telling people to adjust it unless we can match
the workload to the improvement.  There are some people here who can
do that as if by magic, but that's not the issue.  I'm trying to
understand the why it works better for some than for others.  What's
frustrating is simply believing something is the case, without trying
to understand why.  How about, instead of arguing with me, coming up
with something for the challenge?

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Kevin Grittner
Merlin Moncure mmonc...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Merlin Moncure mmonc...@gmail.com wrote:

 So, the challenge is this: I'd like to see repeatable test cases
 that demonstrate regular performance gains  20%.  Double bonus
 points for cases that show gains  50%.

 Are you talking throughput, maximum latency, or some other
 metric?
 
 I am talking about *any* metric..you've got something, let's see
 it.  But it's got to be verifiable, so no points scored.
 
Oh, that wasn't to score points; just advocating for more than a
one-dimensional view of performance.  I'm adding to your demands,
not attempting to satisfy them.  :-)
 
 See my note above about symptoms -- if your symptom of note
 happens to be unpredictable spikes in fast query times under load,
 then I'd like to scribble that advice directly into the docs along
 with (hopefully) some reasoning of exactly why more database
 managed buffers are helping.
 
In our case it was *fewer* shared_buffers which helped.
 
 As noted, I'm particularly interested in things we can test
 outside of production environments, since I'm pretty skeptical the
 Wisconsin Court System is going to allow the internet to log in
 and repeat and verify test methodologies.
 
Right, while it was a fairly scientific and methodical test, it was
against a live production environment.  We adjusted parameters
incrementally, a little each day, from where they had been toward
values which were calculated in advance to be better based on our
theory of the problem (aided in no small part by analysis and advice
from Greg Smith), and saw a small improvement each day with the
problem disappearing entirely right at the target values we had
calculated in advance.  :-)
 
 Point being: cranking buffers may have been the bee's knees with,
 say, the 8.2 buffer manager, but present and future improvements
 may have render that change moot or even counter productive.
 
We did find that in 8.3 and later we can support a larger
shared_buffer setting without the problem than in 8.2 and earlier. 
We still need to stay on the low side of what is often advocated to
keep the failure rate from this issue at zero.
 
 all else being equal, the lowest shared_buffers setting possible
 without sacrificing performance is best because it releases more
 memory to the o/s to be used for other things
 
I absolutely agree with this.
 
I think the problem is that it is very tedious and time-consuming to
construct artificial tests for these things.  Greg Smith has spent a
lot of time and done a lot of research investigating the dynamics of
these issues, and recommends a process of incremental adjustments
for tuning the relevant settings which, in my opinion, is going to
be better than any generalized advice on settings.
 
Don't get me wrong, I would love to see numbers which earned
points under the criteria you outline.  I would especially love it
if they could be part of the suite of tests in our performance farm.
I just think that the wealth of anecdotal evidence and the dearth of
repeatable benchmarks in this area is due to the relatively low-cost
techniques available to tune production systems to solve pressing
needs versus the relatively high cost of creating repeatable test
cases (without, by the way, solving an immediate need).
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Greg Smith

Merlin Moncure wrote:

So, the challenge is this: I'd like to see repeatable test cases that
demonstrate regular performance gains  20%.  Double bonus points for
cases that show gains  50%.


Do I run around challenging your suggestions and giving you homework?  
You have no idea how much eye rolling this whole message provoked from me.


OK, so the key thing to do is create a table such that shared_buffers is 
smaller than the primary key index on a table, then UPDATE that table 
furiously.  This will page constantly out of the buffer cache to the OS 
one, doing work that could be avoided.  Increase shared_buffers to where 
it fits instead, and all the index writes are buffered to write only 
once per checkpoint.  Server settings to exaggerate the effect:


shared_buffers = 32MB
checkpoint_segments = 256
log_checkpoints = on
autovacuum = off

Test case:

createdb pgbench
pgbench -i -s 20 pgbench
psql -d pgbench -c select 
pg_size_pretty(pg_relation_size('public.pgbench_accounts_pkey'))

psql -c select pg_stat_reset_shared('bgwriter')
pgbench -T 120 -c 4 -n pgbench
psql -x -c SELECT * FROM pg_stat_bgwriter

This gives the following size for the primary key and results:

pg_size_pretty

34 MB

transaction type: TPC-B (sort of)
scaling factor: 20
query mode: simple
number of clients: 4
number of threads: 1
duration: 120 s
number of transactions actually processed: 13236
tps = 109.524954 (including connections establishing)
tps = 109.548498 (excluding connections establishing)

-[ RECORD 1 ]-+--
checkpoints_timed | 0
checkpoints_req   | 0
buffers_checkpoint| 0
buffers_clean | 16156
maxwritten_clean  | 131
buffers_backend   | 5701
buffers_backend_fsync | 0
buffers_alloc | 25276
stats_reset   | 2011-05-26 18:39:57.292777-04

Now, change so the whole index fits instead:

shared_buffers = 512MB

...which follows the good old 25% of RAM guidelines given this system 
has 2GB of RAM.  Restart the server, repeat the test case.  New results:


transaction type: TPC-B (sort of)
scaling factor: 20
query mode: simple
number of clients: 4
number of threads: 1
duration: 120 s
number of transactions actually processed: 103440
tps = 861.834090 (including connections establishing)
tps = 862.041716 (excluding connections establishing)

gsmith@meddle:~/personal/scripts$ psql -x -c SELECT * FROM 
pg_stat_bgwriter

-[ RECORD 1 ]-+--
checkpoints_timed | 0
checkpoints_req   | 0
buffers_checkpoint| 0
buffers_clean | 0
maxwritten_clean  | 0
buffers_backend   | 1160
buffers_backend_fsync | 0
buffers_alloc | 34071
stats_reset   | 2011-05-26 18:43:40.887229-04

Rather than writing 16156+5701=21857 buffers out during the test to 
support all the index churn, instead only 1160 buffers go out, 
consisting mostly of the data blocks for pgbench_accounts that are being 
updated irregularly.  With less than 1 / 18th as I/O to do, the system 
executes nearly 8X as many UPDATE statements during the test run.


As for figuring out how this impacts more complicated cases, I hear 
somebody wrote a book or something that went into pages and pages of 
detail about all this.  You might want to check it out.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Samuel Gendler
On Thu, May 26, 2011 at 4:10 PM, Greg Smith g...@2ndquadrant.com wrote:


 As for figuring out how this impacts more complicated cases, I hear
 somebody wrote a book or something that went into pages and pages of detail
 about all this.  You might want to check it out.


I was just going to suggest that there was significant and detailed
documentation of this stuff in a certain book, a well-thumbed copy of which
should be sitting on the desk of anyone attempting any kind of postgres
performance tuning.