On 3/24/13 8:11 AM, Greg Smith wrote:
On 3/22/13 8:45 AM, Ants Aasma wrote:
However, I think the main issue isn't finding new algorithms that are
better in some specific circumstances. The hard part is figuring out
whether their performance is better in general.

Right.  The current page replacement method works as expected.  Many frequently 
accessed pages accumulate a usage count of 5 before the clock sweep hits them.  
Pages that are accessed once and not again before the clock sweep are evicted.  
There are several theoretically better ways to approach this.  Anyone who 
hasn't already been working on this for a few years is very unlikely to come up 
with a brand new idea, one that hasn't already been tested in the academic 
research.
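
For anyone following along without the buffer manager source open, here is a toy Python model of the behavior described above. It is only a sketch under stated assumptions: the ClockSweepPool class and its method names are made up for illustration, new pages are assumed to start with a usage count of 1, and pinning, dirty pages, and locking are ignored entirely. It is not the actual PostgreSQL code.

```python
MAX_USAGE_COUNT = 5  # the cap mentioned in the message above


class ClockSweepPool:
    """Toy buffer pool (hypothetical): clock sweep with capped usage counts."""

    def __init__(self, n_buffers):
        self.pages = [None] * n_buffers  # page id held by each buffer
        self.usage = [0] * n_buffers     # usage count per buffer
        self.where = {}                  # page id -> buffer index
        self.hand = 0                    # clock hand position

    def access(self, page):
        """Touch a page; return True on a hit, False on a miss."""
        idx = self.where.get(page)
        if idx is not None:
            # Hit: bump the usage count, but never past the cap.
            self.usage[idx] = min(self.usage[idx] + 1, MAX_USAGE_COUNT)
            return True
        # Miss: pick a victim buffer and load the page into it.
        idx = self._find_victim()
        old = self.pages[idx]
        if old is not None:
            del self.where[old]
        self.pages[idx] = page
        self.usage[idx] = 1  # assumption: a freshly loaded page starts at 1
        self.where[page] = idx
        return False

    def _find_victim(self):
        """Advance the hand, decrementing counts, until one is already zero."""
        while True:
            idx = self.hand
            self.hand = (self.hand + 1) % len(self.pages)
            if self.usage[idx] == 0:
                return idx
            self.usage[idx] -= 1
```

With a 3-buffer pool, a page touched ten times sits at a count of 5 and survives a short run of one-off pages, while each one-off page is evicted on the next pass of the hand. If the hot page is never touched again it is eventually decremented to zero and evicted too.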

But the real blocker here isn't ideas, it's creating benchmark workloads to validate any 
change.  Right now I see the most promising work that could lead toward the 
"performance farm" idea as all of the Jenkins-based testing that's been going 
on recently.  Craig Ringer has been using that for 2ndQuadrant work here, Peter Eisentraut has 
been working with it: 
http://petereisentraut.blogspot.com/2013/01/postgresql-and-jenkins.html and the PostGIS 
project uses it too.  There's some good momentum brewing there.

When we have regular performance testing with a mix of workloads--I have about 
10 in mind to start--at that point we could start testing performance 
changes to the buffer replacement.  Until then this whole area is hard to touch 
usefully.  You have to assume that any tuning you do for one type of workload 
might accidentally slow another.  Starting with a lot of baseline workloads is 
the only way to move usefully forward when facing that problem.
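
As a rough illustration of why the baseline set matters, the check below compares a candidate build against the baseline for every workload and reports the ones that got slower. The find_regressions function, the 2% tolerance, and the workload names are all hypothetical, and the numbers in the usage note are invented, not measurements.

```python
def find_regressions(baseline, candidate, tolerance=0.02):
    """Return {workload: relative change} for workloads that slowed down.

    baseline and candidate map a workload name to a throughput figure
    (higher is better). A workload counts as a regression when its
    throughput drops by more than `tolerance` (a fraction, e.g. 0.02 = 2%).
    """
    regressions = {}
    for name, base_tps in baseline.items():
        change = (candidate[name] - base_tps) / base_tps
        if change < -tolerance:
            regressions[name] = change
    return regressions
```

With made-up figures such as baseline = {"oltp": 1000.0, "scan": 200.0} and candidate = {"oltp": 1100.0, "scan": 180.0}, the change looks like a win if you only run the first workload, and the second one is what catches the accidental slowdown.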

The other thing I think would be tremendously useful would be the ability to 
get performance data from systems in the field *without having to install extra 
stuff or do a special build*. The last point is critical because there are so 
many places where deviating from a standard package takes an act of Congress.

In this case, if I could run some queries to get stats about clock sweep waits 
and what-not then I could get our shared buffer size changed on some hosts and 
see how those changes affect the numbers. But doing this with a non-standard 
build is pretty much a non-starter.

I know there's been some improvement in this area, but I suspect there's still 
more to go.
--
Jim C. Nasby, Data Architect                   j...@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
